Initial commit: obsidian to gitea
17
phd/weekly-report/25/250608.md
Normal file
@@ -0,0 +1,17 @@
Objectives
- Serverless KVCache cache
- MoE autoscaling

Key Results
- [10/10] Refine a final version of KV$ cache for ATC'25
- [8/10] Run an MoE model in Ali
- [0/10] Analyze the expert-loading pattern in the Ali trace
- [0/10] Fully understand how EP influences performance

Last Week
- Modified vLLM to support tracing the activated experts and tested it on the Ali trace with Qwen3-32B.
- Prepared and submitted KV$ cache to arXiv.

Next Week
- Analyze the expert pattern.
- Test on more MoE models.