obsidian/phd/weekly-report/24/241215.md

Objective

  • Serverless KVCache

Key Results

  • Test a workload-aware KVCache scheduler
  • Implement the workload-aware policy in vLLM

Last Week

  • Designed a workload-aware scheduling policy in the simulator and profiled the KVCache reuse rate.
  • Implemented the designed policy in vLLM.
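
For reference, one way a KVCache reuse rate can be measured in a simulator is sketched below. This is a minimal illustration, not the simulator or policy used in this work: the block size, the LRU eviction rule, and the names `block_keys` and `reuse_rate` are all assumptions, and nothing here calls the vLLM API. Each request is a list of token IDs, blocks are keyed by their full prefix, and the reuse rate is the fraction of blocks found in the cache.

```python
from collections import OrderedDict
from typing import Iterable, List, Tuple

BLOCK_SIZE = 4  # hypothetical number of tokens per KV block


def block_keys(tokens: List[int], block_size: int = BLOCK_SIZE) -> List[Tuple[int, ...]]:
    """Key each full block by the entire prefix up to its end, so a block
    matches only when every earlier token matches as well."""
    n_full = len(tokens) // block_size
    return [tuple(tokens[: (i + 1) * block_size]) for i in range(n_full)]


def reuse_rate(requests: Iterable[List[int]], capacity: int) -> float:
    """Replay requests against an LRU block cache (an assumed baseline policy)
    and return reused_blocks / total_blocks."""
    cache: "OrderedDict[Tuple[int, ...], None]" = OrderedDict()
    reused = 0
    total = 0
    for tokens in requests:
        prefix_hit = True  # reuse stops at the first missing block
        for key in block_keys(tokens):
            total += 1
            if prefix_hit and key in cache:
                reused += 1
                cache.move_to_end(key)  # mark as most recently used
            else:
                prefix_hit = False
                cache[key] = None
                cache.move_to_end(key)
                if len(cache) > capacity:
                    cache.popitem(last=False)  # evict least recently used
    return reused / total if total else 0.0


if __name__ == "__main__":
    trace = [
        [1, 2, 3, 4, 5, 6, 7, 8],
        [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],  # shares a two-block prefix
        [20, 21, 22, 23],
    ]
    print(f"reuse rate: {reuse_rate(trace, capacity=16):.3f}")
```

A workload-aware policy could be compared against this baseline by swapping the eviction step for a different rule and replaying the same trace.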

Next Week

  • Profile the real-world performance of the new policy in vLLM and refine it based on the results.