PD-disagg docs: annotated corrections for e13391e contamination
Adds dated, non-destructive correction notes to the contaminated PD-vs-colo
artifacts after the producer-eviction bug (`evict_blocks(sent_block_ids)` on
`finished_sending`, deployed over the "fresh" pip vLLM by
`scripts/deploy_vllm_patches.sh`) was found and gated behind
`VLLM_EVICT_SENT_BLOCKS` (default off).

PD_DISAGG_RESULTS.md
    Top CORRECTION banner + §6 RETRACTED marker. §6 (session-affinity
    hot-pin) was an `e13391e` artifact under controlled concurrency;
    §3 RR, §4 TPOT win, §5 D-pool ceiling, and §5.1 consumer crash stand.

RESULTS_SUMMARY.md
    §4 confirm+refine note: the clean ablation confirms the D-pool
    capacity thesis and adds regime dependence.

pd_separation_analysis.md
    Scoped caution: thesis confirmed; flags only reuse-dependent figures
    for cross-check (this study used a different stack).

figs/mb5/CORRECTION.md
    Flags mb5_producer_hotspot.png as retracted; §3 RR and §5 D-pool
    figures stand.
19
figs/mb5/CORRECTION.md
Normal file
@@ -0,0 +1,19 @@
# ⚠️ Correction notice for figs/mb5/ (2026-05-30)

These figures back `microbench/fresh_setup/PD_DISAGG_RESULTS.md`. A producer-side
contamination was later found in the stack that produced them: commit **`e13391e`**
(deployed over the "fresh" pip vLLM by `scripts/deploy_vllm_patches.sh`) evicts a
producer's prefix-cache blocks on every KV transfer, so a disaggregated producer
could never keep a session's prefix warm. It is now gated behind
`VLLM_EVICT_SENT_BLOCKS` (default off) and everything was re-run clean.
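
To make the mechanism concrete, here is a minimal toy sketch of why evicting sent blocks keeps a producer's prefix cache cold for a multi-turn session. It is not the vLLM code path: `ToyProducer`, `session_hit_rates`, and the truthy-string parsing of `VLLM_EVICT_SENT_BLOCKS` are assumptions made for illustration, and the hit rates it produces are toy numbers, not measurements.

```python
import os


def evict_sent_blocks_enabled(env=os.environ):
    """Return True only when the gate is explicitly switched on.

    Assumption: the real parsing of VLLM_EVICT_SENT_BLOCKS may differ;
    this sketch only models "default off".
    """
    value = env.get("VLLM_EVICT_SENT_BLOCKS", "0")
    return value.strip().lower() in {"1", "true", "yes", "on"}


class ToyProducer:
    """Hypothetical prefill worker with a set-based prefix cache."""

    def __init__(self, evict_on_send):
        self.evict_on_send = evict_on_send
        self.cached = set()  # block ids currently held in the prefix cache

    def prefill(self, block_ids):
        # Fraction of this prompt's blocks that were already cached.
        hits = sum(1 for block in block_ids if block in self.cached)
        self.cached.update(block_ids)
        # After the KV transfer finishes, the contaminated behaviour
        # drops every block that was just sent.
        if self.evict_on_send:
            self.cached.difference_update(block_ids)
        return hits / len(block_ids)


def session_hit_rates(evict_on_send, turns=4, blocks_per_turn=8):
    """Replay one session whose prompt grows by blocks_per_turn each turn."""
    producer = ToyProducer(evict_on_send)
    rates = []
    for turn in range(1, turns + 1):
        prompt_blocks = list(range(turn * blocks_per_turn))
        rates.append(producer.prefill(prompt_blocks))
    return rates


if __name__ == "__main__":
    print("gate enabled:", evict_sent_blocks_enabled())
    print("evict on send:", session_hit_rates(evict_on_send=True))
    print("keep blocks:  ", session_hit_rates(evict_on_send=False))
```

In this toy model the evicting producer never scores a prefix hit, however many turns the session has, while the non-evicting producer's hit rate climbs as the shared prefix grows.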

| figure | section | status |
|---|---|---|
| `mb5_producer_hotspot.png` | §6.3 session-affinity hot-pinning | 🛑 **RETRACTED**: pure `e13391e` artifact. On the clean stack, session-routed producers reach APC parity with colo (71–82%); there is no 0%-util stall / hot-pin pathology. |
| `mb5_latency_compare.png` | §3 round-robin headline | ✅ stands: RR's ~0% prefix-hit is a *routing* artifact (consecutive turns → different producers), not the eviction bug; reproduced clean. |
| `mb5_kv_timeline.png`, `mb5_role_split.png`, `mb5_peak_utilization.png` | §5 per-role KV pool occupancy | ✅ D-pool capacity-ceiling mechanism stands (decode pegs while prefill strands). P-pool occupancy may read slightly low under eviction; the qualitative split is unaffected. |
| `mb5_summary.csv` | aggregate | mixed: §3/§5 rows valid; any session-affinity rows superseded. |

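The distinction the table draws between a routing artifact and the eviction bug can be illustrated with a small simulation. This is a sketch under stated assumptions, not the benchmark harness: `simulate`, `round_robin`, and `session_affinity` are hypothetical names, no eviction is modelled at all, and the session, producer, and block counts are arbitrary.

```python
def round_robin(request_index, session, producers):
    """Send each incoming request to the next producer in turn."""
    return request_index % producers


def session_affinity(request_index, session, producers):
    """Pin every turn of a session to the same producer."""
    return session % producers


def simulate(route, sessions=5, producers=4, turns=3, blocks_per_turn=8):
    """Return the aggregate prefix-hit fraction for a routing policy.

    Sessions are interleaved turn by turn, and each turn's prompt
    contains all blocks from the session's earlier turns plus new ones.
    """
    caches = [set() for _ in range(producers)]
    hits = 0
    total = 0
    request_index = 0
    for turn in range(turns):
        for session in range(sessions):
            target = route(request_index, session, producers)
            prompt = {
                (session, block)
                for block in range((turn + 1) * blocks_per_turn)
            }
            hits += len(prompt & caches[target])
            total += len(prompt)
            caches[target] |= prompt  # nothing is ever evicted here
            request_index += 1
    return hits / total


if __name__ == "__main__":
    print("round-robin hit rate:     ", simulate(round_robin))
    print("session-affinity hit rate:", simulate(session_affinity))
```

With these toy parameters, round-robin sends each session's consecutive turns to different producers, so the prefix-hit rate is zero even though no blocks are evicted, while session-affinity routing reuses every previously cached block.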
**Superseded by the corrected three-axis ablation:** [`../mb5_pd_ablation/`](../mb5_pd_ablation/)
(reuse / shape / concurrency), data in [`../../analysis/mb5_pd_ablation/`](../../analysis/mb5_pd_ablation/).
Raw §6 data `analysis/mb5/session_prod.json` is contaminated; `analysis/mb5/rr_prod.json` (round-robin) stands.