Objectives
- Serverless KVCache cache
- MoE pattern feature
- EP design for inference performance

Key Results
- [5/10] Prepare slides for ATC'25 presentation with Jinbo
- [1/10] Survey existing MoE work and its observations
- [9/10] Analyze the temporal locality of expert load balance
- [0/10] Analyze correlations between MoE layers
- [0/10] Fully understand how EP influences performance
- [0/10] Verify how dynamic EP influences performance

Last Week
- Traced expert patterns with the Qwen trace on Qwen3-235B and DeepSeek-671B.
- Analyzed the temporal locality of expert patterns in large models (Qwen3-235B and DeepSeek-671B).
- Prepared KVCache slides.
- Handled miscellaneous graduation tasks.

Next Week
- Analyze the correlations of expert patterns between layers.
- Survey current MoE work for more observations to check.