obsidian/250608.md at main - obsidian - Local Gitea

Gahow Wang a57afa86b4 Initial commit: obsidian to gitea

2026-05-07 15:04:41 +08:00

Objectives

Serverless KVCache cache
MoE autoscaling

Key Results

[10/10] Refine a final version of KV$ cache for ATC'25
[8/10] Run an MoE model in Ali
[0/10] Analyze the pattern of expert loading in the Ali trace
[0/10] Fully understand how EP influences performance

Last Week

Modify vLLM to trace the activated experts, and test on the Ali trace with Qwen3-32B.
Prepare and submit KV$ cache to arXiv.

Next Week

Analyze the expert pattern.
Test on more MoE models.