Objectives
- Serverless KVCache cache
- MoE autoscaling

Key Results
- [10/10] Refine the final version of the KV$ cache paper for ATC'25
- [8/10] Run an MoE model in Ali
- [0/10] Analyze the expert-loading pattern in the Ali trace
- [0/10] Fully understand how EP influences performance

Last Week
- Modified vLLM to support tracing the activated experts, and tested it on the Ali trace with Qwen3-32B.
- Prepared the KV$ cache paper and submitted it to arXiv.

Next Week
- Analyze the expert activation pattern.
- Test on more MoE models.