Objectives
- Serverless KVCache cache
- MoE pattern feature
- EP design for inference performance

Key Results
- [5/10] Prepare slides for ATC'25 presentation w/ Jinbo
- [1/10] Survey MoE work and the observations it reports
- [9/10] Analyze the temporal locality of expert load balance
- [0/10] Analyze correlations between MoE layers
- [0/10] Fully understand how EP influences performance
- [0/10] Verify how dynamic EP influences performance

Last Week
- Traced expert patterns using the Qwen trace on Qwen3-235B and DeepSeek-671B.
- Analyzed the temporal locality of expert patterns in large models (Qwen3-235B and DeepSeek-671B).
- Prepared KVCache slides.
- Handled miscellaneous graduation tasks.

Next Week
- Analyze correlations in expert patterns across layers.
- Survey current MoE work for further observations to verify.