obsidian/phd/weekly-report/25/250601.md

Objectives
- Serverless KVCache
- MoE autoscaling

Key Results
- [10/10] Refine the final version of KV$ for ATC'25
- [10/10] Graduation thesis defense
- [2/10] Run an MoE model in Ali
- [0/10] Analyze the expert loading pattern in the Ali trace

Last Week
- Prepared for and completed the graduation thesis defense.
- Polished the final version of KV$ and sent it to the shepherd.
- Ran Qwen3-32B on the latest vLLM.

Next Week
- Modify vLLM to trace the expert load pattern.