obsidian/phd/weekly-report/24/241215.md

Objective
- Serverless KVCache
Key Results
- Test a workload-aware KVCache scheduler
- Implement the workload-aware policy in vLLM
Last Week
- Designed a workload-aware scheduling policy in a simulator and profiled the KVCache reuse rate.
- Implemented the designed policy in vLLM.
Next Week
- Profile the real-world performance of the new policy in vLLM and refine it based on the results.
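
For reference, a minimal sketch of how a KVCache reuse rate could be measured in a simulator. This is not the policy from this report: it assumes a block-level prefix cache with plain LRU eviction as a baseline, and `BLOCK_SIZE`, `block_keys`, and `reuse_rate` are hypothetical names introduced only for illustration. Reuse rate is taken here to be the fraction of full prompt blocks served from the cache.

```python
from collections import OrderedDict

BLOCK_SIZE = 4  # tokens per KV block (hypothetical; kept small for the demo)


def block_keys(tokens, block_size=BLOCK_SIZE):
    """Return one key per full block; each key covers the whole prefix up to that block."""
    return [tuple(tokens[:end]) for end in range(block_size, len(tokens) + 1, block_size)]


def reuse_rate(requests, capacity_blocks):
    """Replay requests against an LRU block cache and return reused_blocks / total_blocks."""
    cache = OrderedDict()  # key -> True, ordered from least to most recently used
    reused = total = 0
    for tokens in requests:
        prefix_hit = True  # a block is reusable only if every earlier block also hit
        for key in block_keys(tokens):
            total += 1
            if prefix_hit and key in cache:
                reused += 1
                cache.move_to_end(key)
            else:
                prefix_hit = False
                cache[key] = True
                cache.move_to_end(key)
                if len(cache) > capacity_blocks:
                    cache.popitem(last=False)  # evict the least recently used block
    return reused / total if total else 0.0


# Assumed toy workload: two identical prompts, then one sharing only the first block.
demo = [
    [1, 2, 3, 4, 5, 6, 7, 8],
    [1, 2, 3, 4, 5, 6, 7, 8],
    [1, 2, 3, 4, 9, 9, 9, 9],
]
print(reuse_rate(demo, capacity_blocks=10))  # 0.5
```

A workload-aware policy would replace the LRU eviction step with a decision based on workload features; the same replay loop can then compare reuse rates across policies.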