Objectives
- Serverless KVCache
- MoE pattern feature
- EP design for inference performance
Key Results
- [9/10] Prepare slides for ATC'25 presentation w/ Jinbo
- [6/10] Survey MoE works and their observations
- [9/10] Analyze the temporal locality of expert load balance
- [0/10] Analyze correlations between MoE layers
- [0/10] Fully understand how EP influences performance
- [0/10] Verify how dynamic EP influences performance
Last Week
- Survey MoE works and summarize their key points.
- Refine KVCache slides w/ Jinbo.
- Nit: support usage of the Ali machines and provide a landing doc.
Next Week
- Check the feasibility of the EP combinatory method.