Initial commit: obsidian to gitea
20
phd/weekly-report/25/250615.md
Normal file
@@ -0,0 +1,20 @@
Objectives
- Serverless KVCache
- MoE pattern feature
- EP design for inference performance
Key Results
- [0/10] Prepare slides for ATC'25 presentation w/ Jinbo
- [8/10] Run MoE models in Ali
- [5/10] Analyze the expert-loading pattern in the Ali trace
- [3/10] Analyze the expert pattern across different models
- [0/10] Fully understand how EP influences performance
- [0/10] Verify how dynamic EP influences performance
Last Week
- Extended vLLM to trace the expert pattern under PP and Ray-based distributed execution for DeepSeek-671B.
- Analyzed the temporal locality of the expert pattern.
Next Week
- Finish the vLLM development so that it covers all models.
- Analyze the expert pattern's correlations between layers.