Objectives
- MoE pattern feature
- EP design for inference performance
Key Results
- [6/10] Survey MoE works and their observations
- [9/10] Analyze the temporal locality of expert load balance
- [4/10] Analyze correlations between MoE layers
- [0/10] Fully understand how EP influences performance
- [0/10] Verify how dynamic EP influences performance
Last Week
- Survey the infrastructure of Bailian, especially model serving and batching.
- Give a KVCache talk at Ali w/ Jinbo.
- Review 2 papers as shadow PC.
- Survey the agent workflow for potential system problems.
Next Week
- Survey scheduling under different parallelism setups.
- Review and write comments for all assigned papers.