Objectives
- MoE pattern features
- EP design for inference performance

Key Results
- [6/10] Survey prior MoE work and its observations
- [9/10] Analyze the temporal locality of expert load balance
- [4/10] Analyze correlations between MoE layers
- [0/10] Fully understand how EP influences performance
- [0/10] Verify how dynamic EP influences performance

Last Week
- Survey the infrastructure of Bailian, especially model serving and batching.
- Give a KVCache talk at Ali w/ Jinbo.
- Review 2 papers as shadow PC.
- Survey the agent workflow for potential system problems.

Next Week
- Survey scheduling for different parallelism setups.
- Review and write comments for all assigned papers.