Remove 'capped' references from MEETING.md and PAPER_OUTLINE.md prose
Companion to the figure cleanup: prose in §3.1 was still quoting "capped 31.6% APC" as one of the failure-mode datapoints. Same reason as the figures: capped is a workload manipulation, not a policy, so it doesn't belong in the §3.1 routing-policy narrative.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@@ -39,7 +39,7 @@ L = Λ · N · W_turn(L) # agentic, T_human≈0
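-LMetric reaches 56.9% APC, load_only 54.1%, and capped 31.6%, all far below the 79.6% upper bound. The 23pp gap comes directly from intra-session hits lost to cross-instance routing.
+LMetric reaches 56.9% APC and load_only 54.1%, both far below the 79.6% upper bound. The 23pp gap comes directly from intra-session hits lost to cross-instance routing.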
### Static PD-disagg: the D-side KV capacity wall
@@ -145,7 +145,7 @@ Round-robin and load-aware routing (e.g., LMetric, OSDI'26) maximize instance
**Figure 4: Three baselines, three failure modes**, split into three subfigures placed in §3.1, §3.2, and §3.3 respectively:
-§3.1: measured APC vs. the theoretical upper bound of 79.6% (lmetric 56.9%, load_only 54.1%, capped 31.6%, sticky 77.2%, unified 79.4%):
+§3.1: measured APC vs. the theoretical upper bound of 79.6% (lmetric 56.9%, load_only 54.1%, sticky 77.2%, unified 79.4%):
§3.2: D-side KV pool occupancy vs. per-request KV footprint; both 4P+4D and 6P+2D cross the 90% memory wall in the agentic regime:
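As a quick sanity check on the "23pp gap" wording that survives this change, the shortfall of each remaining policy from the 79.6% upper bound can be recomputed from the numbers quoted in the diff. This is only a sketch: the dictionary and the helper name `gap_pp` are illustrative and not part of the repository, and it assumes the 23pp figure refers to LMetric, which the arithmetic appears to support.

```python
# APC hit rates (percent) quoted in the §3.1 lines of the diff.
# "capped" is left out on purpose: this commit removes it from the prose.
UPPER_BOUND = 79.6
measured = {
    "lmetric": 56.9,
    "load_only": 54.1,
    "sticky": 77.2,
    "unified": 79.4,
}


def gap_pp(apc, upper=UPPER_BOUND):
    """Shortfall from the upper bound, in percentage points (hypothetical helper)."""
    return round(upper - apc, 1)


gaps = {name: gap_pp(apc) for name, apc in measured.items()}

for name, gap in gaps.items():
    print(f"{name:>10}: {gap:.1f} pp below the {UPPER_BOUND}% bound")
```

Under these numbers the LMetric shortfall is 22.7pp, which rounds to the 23pp quoted in the prose, while load_only sits 25.5pp below the bound.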