obsidian/phd/weekly-report/24/241201.md

Objective
- Workload-centric KV cache scheduling
- XPURemoting adaptation for PhOS
Key Results
- Define the Good KVCache hit rate under different conditions [6/10]
- Demonstrate interference between different workloads in current vLLM
- Modify XPURemoting to support PhOS (v1)
Last Week
- Survey different KVCache scheduling algorithms and summarize what they have in common, as a basis for defining the Good KVCache hit rate.
- Profile the Ali trace in vLLM and group its requests to demonstrate interference.
- Adapt XPURemoting to support the current PhOS API, and fully test the implementation against PhOS's open-source examples. See [MR](https://ipads.se.sjtu.edu.cn:1312/scaleaisys/xpuremoting/-/merge_requests/25) for XPURemoting and [e80bf94](https://github.com/Gahow/PhoenixOS/commit/e80bf94075fcd6f53c97406dadfbe7f13fc16092) for PhOS.
Next Week
- Finish the definition of the Good KVCache hit rate.