obsidian/phd/weekly-report/24/241201.md

Objective

  • Workload-centric KV cache scheduling
  • XPURemoting adaptation for PhOS

Key Results

  • Define the Good KVCache hit rate in different conditions [6/10]
  • Prove interference between different workloads in the current vLLM
  • Modify XPURemoting to support PhOS (v1)

Last Week

  • Surveyed different KVCache scheduling algorithms and summarized what they have in common, as a basis for defining the Good KVCache hit rate.
  • Profiled the ali trace in vLLM and grouped the trace data to prove interference between workloads.
  • Adapted XPURemoting to support PhOS's current API and fully tested the implementation against PhOS's open-source examples. Changes: an MR for XPURemoting and e80bf94 for PhOS.

Next Week

  • Finish the definition of the Good KVCache hit rate.