diff --git a/REPORT.md b/REPORT.md
index 8c96df4..25745da 100644
--- a/REPORT.md
+++ b/REPORT.md
@@ -10,6 +10,26 @@
 
 For agentic LLM workloads (long input, short output, high KV cache reuse), is prefill-decode disaggregation beneficial? If full PD separation hurts (proven in §3), can **selective** disaggregation of only heavy requests improve serving latency while preserving KV cache locality?
 
+## 1.1 Errata / Superseded sections
+
+> This report has been revised several times as the methodology matured.
+> The sections below are kept for historical context but their numerical
+> conclusions have been **superseded**; do not cite them in isolation.
+>
+> - **§3.1 (initial PD-sep vs PD-combined)**: ran with the old random
+>   sampler, `--time-scale` compression, and `--max-inflight-sessions 8`.
+>   Cross-session KV reuse dropped from 52% to 16%, and per-GPU concurrency
+>   was capped at 1 req/GPU. Superseded by §3.6.
+> - **Earlier "elastic v3" warm-vs-fresh runs**: baselines were not
+>   restarted between trials, leaving residual KV cache that inflated
+>   baseline TTFT ~2×. Superseded by the cold-start results in §3.6/§3.7.
+> - **Any reference to running `--max-inflight-sessions 64+`**: that flag
+>   was removed when replay moved to trace-driven dispatch. The next-step
+>   experiment requires restoring the flag first (see `FIXES.md` §B2
+>   route A) before any production-concurrency numbers can be produced.
+>
+> The authoritative results are in **§3.6 and §3.7**.
+
 ## 2. Experimental Setup
 
 ### 2.1 Hardware