Objectives
- Serverless KVCache
- MoE expert pattern features
- Expert parallelism (EP) design for inference performance
Key Results
- [5/10] Prepare slides for ATC'25 presentation w/ Jinbo
- [1/10] Survey existing MoE work and its observations
- [9/10] Analyze the temporal locality of expert load balance
- [0/10] Analyze correlations between MoE layers
- [0/10] Fully understand how EP influences performance
- [0/10] Verify how dynamic EP influences performance
Last Week
- Traced expert patterns with the Qwen trace on Qwen3-235B and DeepSeek-671B.
- Analyzed the temporal locality of expert patterns in large models (Qwen3-235B and DeepSeek-671B).
- Prepared KVCache slides.
- Handled miscellaneous graduation tasks.
Next Week
- Analyze the correlations of expert patterns between layers.
- Survey current MoE work for more observations to check.