agentic-kvc/figs/mb5/CORRECTION.md
Gahow Wang a2111b6e18 PD-disagg docs: annotated corrections for e13391e contamination
Adds dated, non-destructive correction notes to the contaminated PD-vs-colo
artifacts after the producer-eviction bug (`evict_blocks(sent_block_ids)` on
`finished_sending`, deployed over the "fresh" pip vLLM by
`scripts/deploy_vllm_patches.sh`) was found and gated behind
`VLLM_EVICT_SENT_BLOCKS` (default off).

  PD_DISAGG_RESULTS.md  top CORRECTION banner + §6 RETRACTED marker.
                        §6 (session-affinity hot-pin) was an `e13391e`
                        artifact under controlled concurrency; §3 RR, §4
                        TPOT win, §5 D-pool ceiling, §5.1 consumer crash
                        stand.
  RESULTS_SUMMARY.md    §4 confirm+refine note: clean ablation confirms
                        the D-pool capacity thesis and adds regime-
                        dependence.
  pd_separation_analysis.md  scoped caution: thesis confirmed; flags
                        only reuse-dependent figures for cross-check
                        (this study used a different stack).
  figs/mb5/CORRECTION.md  flags mb5_producer_hotspot.png as retracted;
                        §3 RR and §5 D-pool figures stand.
2026-05-31 20:14:14 +08:00


⚠️ Correction notice for figs/mb5/ (2026-05-30)

These figures back `microbench/fresh_setup/PD_DISAGG_RESULTS.md`. A producer-side contamination was later found in the stack that produced them: commit `e13391e` (deployed over the "fresh" pip vLLM by `scripts/deploy_vllm_patches.sh`) evicts a producer's prefix-cache blocks on every KV transfer, so a disaggregated producer could never keep a session's prefix warm. The eviction is now gated behind `VLLM_EVICT_SENT_BLOCKS` (default off), and everything was re-run on the clean stack.
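The effect of the eviction, and of gating it, can be pictured with a small toy model. This is a sketch under stated assumptions, not vLLM's code: `ToyProducer`, `evict_sent_blocks_enabled`, `session_hits`, and the accepted truthy spellings are all hypothetical. Only the variable name `VLLM_EVICT_SENT_BLOCKS` and its default-off behaviour come from this notice.

```python
import os


def evict_sent_blocks_enabled(env=None):
    """Return True only if VLLM_EVICT_SENT_BLOCKS is explicitly set truthy.

    The accepted spellings are an assumption; the notice only says "default off".
    """
    env = os.environ if env is None else env
    value = env.get("VLLM_EVICT_SENT_BLOCKS", "0")
    return value.strip().lower() in {"1", "true", "yes", "on"}


class ToyProducer:
    """Hypothetical stand-in for a prefill producer's prefix cache."""

    def __init__(self, evict_sent):
        self.evict_sent = evict_sent
        self.cached = set()  # block ids currently held in the prefix cache

    def prefill_and_send(self, block_ids):
        """Prefill a request, 'send' its blocks, return the number of cache hits."""
        hits = sum(1 for b in block_ids if b in self.cached)
        self.cached.update(block_ids)
        if self.evict_sent:
            # Contaminated behaviour: drop every block that was just sent.
            self.cached.difference_update(block_ids)
        return hits


def session_hits(evict_sent, turns=4, blocks_per_turn=3):
    """Replay one session whose prompt grows by blocks_per_turn blocks per turn."""
    producer = ToyProducer(evict_sent)
    total = 0
    for turn in range(1, turns + 1):
        total += producer.prefill_and_send(range(turn * blocks_per_turn))
    return total
```

In this toy, `session_hits(evict_sent=True)` never reuses a block because the cache is emptied after each send, while `session_hits(evict_sent=False)` reuses every block from the previous turn.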

| figure | section | status |
|---|---|---|
| `mb5_producer_hotspot.png` | §6.3 session-affinity hot-pinning | 🛑 RETRACTED: pure `e13391e` artifact. On the clean stack, session-routed producers reach APC parity with colo (71–82%); there is no 0%-util stall / hot-pin pathology. |
| `mb5_latency_compare.png` | §3 round-robin headline | stands: RR's ~0% prefix-hit is a routing artifact (consecutive turns → different producers), not the eviction bug; reproduced clean. |
| `mb5_kv_timeline.png`, `mb5_role_split.png`, `mb5_peak_utilization.png` | §5 per-role KV pool occupancy | D-pool capacity-ceiling mechanism stands (decode pegs while prefill strands). P-pool occupancy may read slightly low under eviction; the qualitative split is unaffected. |
| `mb5_summary.csv` | aggregate | mixed: §3/§5 rows valid; any session-affinity rows superseded. |
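The routing explanation for the round-robin row (consecutive turns land on different producers) can be illustrated with a toy router comparison. This is a minimal sketch, not the benchmark harness: `prefix_hits`, `round_robin`, `session_affinity`, and the producer, session, and turn counts are hypothetical. A "hit" here simply means the producer that receives a turn also served that session's previous turn.

```python
def prefix_hits(route, n_producers=4, n_sessions=3, turns=5):
    """Count turns whose producer also served that session's previous turn.

    route(request_index, session_id, n_producers) -> producer index.
    """
    last_producer = {}  # session id -> producer that served its latest turn
    hits = 0
    request_index = 0
    for _ in range(turns):
        for session in range(n_sessions):  # sessions interleave in a fixed order
            producer = route(request_index, session, n_producers)
            if last_producer.get(session) == producer:
                hits += 1
            last_producer[session] = producer
            request_index += 1
    return hits


def round_robin(request_index, session, n_producers):
    """Ignore the session; hand requests to producers in arrival order."""
    return request_index % n_producers


def session_affinity(request_index, session, n_producers):
    """Pin every turn of a session to one producer."""
    return session % n_producers
```

With these defaults, round-robin never sends two consecutive turns of a session to the same producer, while session affinity hits on every turn after the first. The session count is deliberately not a multiple of the producer count; with strict interleaving and a matching multiple, this toy round-robin would pin sessions by accident.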

Superseded by the corrected three-axis ablation: ../mb5_pd_ablation/ (reuse / shape / concurrency), data in ../../analysis/mb5_pd_ablation/. Raw §6 data analysis/mb5/session_prod.json is contaminated; analysis/mb5/rr_prod.json (round-robin) stands.