abde010b64e6441b25120da2bec1053d2487ef52
One-page distillation of what the paper can claim today, with figure /
data path next to each row. Sections:
1. Workload properties: intra-session reuse, skew, KV footprint
2. Dispatch Coupling: agentic vs chatbot inter-turn gap regime
3. Three failure modes of existing scheduling: load-balance /
static PD-disagg / pure sticky
4. PD-disagg cost vs benefit: MB2 (transfer 9.7 GB/s ceiling,
topology-independent) + MB1 (decode halted during prefill 15-200x),
joined into the §3.2 cost > benefit headline for any KV ≥ 80 MiB
5. EAR empirical status: Pillar 1 (affinity) validated, Pillar 2
(migration) substrate validated + strategy-layer pending
6. Paper claims that can be written today (ordered by confidence)
7. To do (MB3-5, migration e2e, wall-clock sweep, scale-out)
Designed to be the one doc to read when re-entering the project after
a break.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>