Objective
- Workload-centric KV cache scheduling
- XPURemoting adaptation for PhOS

Key Results
- Define the Good KVCache hit rate under different conditions [6/10]
- Prove that different workloads interfere with each other in current vLLM
- Modify XPURemoting to support PhOS (v1)

Last Week
- Surveyed different KVCache scheduling algorithms and summarized what they have in common, as input for the definition of Good KVCache hit rate.
- Profiled the ali trace in vLLM and grouped the trace entries to prove interference.
- Adapted XPURemoting to support PhOS's current API and fully tested the implementation against PhOS's open-source examples: [MR](https://ipads.se.sjtu.edu.cn:1312/scaleaisys/xpuremoting/-/merge_requests/25) for XPURemoting and [e80bf94](https://github.com/Gahow/PhoenixOS/commit/e80bf94075fcd6f53c97406dadfbe7f13fc16092) for PhOS.

Next Week
- Finish the definition of Good KVCache hit rate.