Objectives
- Serverless KVCache cache
- MoE pattern feature
- EP design for inference performance

Key Results
- [9/10] Prepare slides for ATC'25 presentation w/ Jinbo
- [6/10] Survey MoE works and their observations
- [9/10] Analyze the temporal locality of expert load balance
- [0/10] Analyze correlations between MoE layers
- [0/10] Fully understand how EP influences performance
- [0/10] Verify how dynamic EP influences performance

Last Week
- Surveyed MoE works and summarized their key points.
- Refined KVCache slides w/ Jinbo.
- Nit: supported Ali machine usage and wrote a landing doc.

Next Week
- Check the feasibility of the EP combinatorial method.