diff --git a/analysis/characterization/window_1_results.md b/analysis/characterization/window_1_results.md
new file mode 100644
index 0000000..6db3365
--- /dev/null
+++ b/analysis/characterization/window_1_results.md
@@ -0,0 +1,171 @@
+# Window 1 Results: B1' + B2 + B3
+
+Status: Window 1 complete (CPU + 2 dash0 GPU windows on 2026-05-25)
+Sweep: `outputs/b3_sweep_20260525_095043` (B3) + `outputs/b2_microbench/` (B2)
+Trace: `traces/w600_r0.0015_st30.jsonl` (1214 requests / 274 sessions / 53.3 M input tokens)
+Model: Qwen3-Coder-30B-A3B-Instruct (TP1 × 8 instances on H20)
+
+Per-policy artifacts under `window_1_results/`. Figures under `window_1_results/figures/`.
+
+## Headline
+
+| Claim | Status | Evidence |
+|---|---|---|
+| Agentic workload reuse is overwhelmingly intra-session | **supported** | 93.2% of measured cached_tokens are intra-session (lmetric run); theoretical any-session APC ceiling 80.3% vs intra-session ceiling 79.6% → < 1 pp gap |
+| LMetric leaves 23 pp of APC on the table | **supported** | lmetric achieved 56.9% vs intra-session ceiling 79.6% (theoretical) |
+| Hard session affinity recovers the locality lost by LMetric | **supported** | sticky APC 77.2% = 97% of theoretical ceiling |
+| Hard affinity inflates same-worker prefill-decode interference | **supported** | sticky interference_index 13.65 vs lmetric 6.53 |
+| Hybrid affinity (Unified) breaks the locality-vs-latency tradeoff | **supported** | unified hits 79.4% APC and TTFT p90 7.24 s (lmetric 15.6 s) simultaneously |
+| Same-worker prefill-decode interference is causal, not merely correlational | **supported** | different-worker control idx≈1.0; same-worker TTFT idx scales monotonically with prefill size (TPOT idx up to 32k) |
+| Heavy-tail sessions are *a* contributor to hot-spot, not the sole cause | **supported** | cap=8 truncated trace removes 37% of requests (1214 → 770); hotspot drops only 13% (2.24→1.94) |
+
+## B1' Workload characterization
+
+### Per-request KV footprint (Qwen3-Coder-30B-A3B)
+
+`kv_bytes_per_token = 2 × num_layers × num_kv_heads × head_dim × dtype_bytes = 2 × 48 × 4 × 128 × 2 = 98304 B`
+
+Full GLM-5.1 trace (2.11 M requests, 1.31 M sessions):
+
+| | p50 | p90 | p95 | p99 | max |
+|---|---:|---:|---:|---:|---:|
+| KV per request | 1.83 GiB | 8.04 GiB | 9.59 GiB | **11.49 GiB** | 18.5 GiB |
+
+H20 has ~95 GiB usable per GPU. **A single p99 request occupies 12% of a single H20's HBM** purely for KV. Multi-request batching is bounded by this.
+
+Figure: `figures/fig_kv_footprint_cdf.png`.
+
+### Real reuse decomposition (from lmetric run on w600 trace)
+
+| class | tokens | fraction |
+|---|---:|---:|
+| intra-session | 28.3 M | **93.2%** |
+| cross-session | 1.72 M | 5.7% |
+| shared / system-prefix | 0.34 M | 1.1% |
+| unclassified | 0 | 0.0% |
+
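+→ session-affinity routing can capture the intra-session share, 93.2% of measured reuse (and 99% of the theoretical any-session ceiling: 79.6% of 80.3%). Shared-prefix reuse is only 1.1%, so there is no meaningful "system prompt" in this trace.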
+
+Figure: `figures/fig_reuse_decomposition.png`.
+
+### Theoretical APC ceilings on w600
+
+Computed by building a block-level trie of `hash_ids` per session (intra-session) or globally (any-session), then walking each request's `hash_ids` to count its longest prefix-match against previously-seen prefixes.
+
+| variant | upper bound | hit requests |
+|---|---:|---:|
+| any-session (perfect global cache) | **80.3%** | 961 / 1214 |
+| intra-session only | **79.6%** | 914 / 1214 |
+| shared-prefix only (pos 0, ≥8 sessions) | 0.10% | 107 / 1214 |
+
+Gap "any − intra" is 0.7 pp → a perfect global cache adds almost nothing over per-session caching in this trace.
+
+## B3 5-policy routing sweep
+
+8 vLLM instances on TP1, w600 trace, `--enable-prompt-tokens-details` so `cached_tokens` is reported per request.
+
+| policy | TTFT p50/p90/p99 (s) | TPOT p50/p90/p99 (ms) | E2E p50/p90/p99 (s) | **APC** | interference | **hotspot** | n_slow |
+|---|---|---|---|---:|---:|---:|---:|
+| lmetric | 0.94 / 15.59 / 52.95 | 8.9 / 21.2 / 175.9 | 2.75 / 24.75 / 79.62 | 56.9% | 6.53 | 2.24 | 295 |
+| load_only | 1.25 / 20.15 / 52.65 | 9.2 / 26.7 / 320.7 | 3.58 / 33.43 / 93.92 | 54.1% | 9.16 | **1.14** | 379 |
+| sticky | 0.54 / 18.02 / 71.37 | 8.9 / 36.1 / 345.2 | 2.08 / 34.61 / 133.58 | 77.2% | **13.65** | 2.35 | 234 |
+| **unified** | **0.50 / 7.24 / 42.02** | 8.1 / 17.1 / 118.1 | **1.75 / 17.89 / 68.18** | **79.4%** | n/a* | 3.35 | **189** |
+| capped | 1.20 / 12.76 / 46.05 | 7.2 / 16.0 / 101.5 | 2.59 / 21.24 / 73.39 | 31.6% | 6.33 | 1.94 | 185 |
+
+\*unified's `engine_state` was overwritten by the analyzer's slice step before the `b3_analyze.sh` fix landed, so `interference_index` could not be computed; vLLM and the patch themselves worked correctly. The B2 microbench provides a cleaner interference proof.
+
+**Mechanism indices**
+- `interference_index` = TPOT_p90(decode overlapping same-worker prefill) / TPOT_p90(clean)
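+- `hotspot_index` = max(worker TTFT p90) / median(worker TTFT p90); with 8 workers the reported values match the upper median (5th-smallest worker), not the mean of the middle two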
+
+Figures: `fig_b3_latency_bars.png`, `fig_b3_apc_vs_upper.png`,
+`fig_b3_apc_vs_hotspot.png`, `fig_b3_per_worker_ttft_p90.png`,
+`fig_b3_failure_breakdown.png`.
+
+### Per-policy reading
+
+- **lmetric** is the cache-aware baseline. Its 56.9% APC reaches only about 72% of the intra-session ceiling (79.6%); the missing 23 pp is the locality opportunity that unified picks up.
+- **load_only** strips cache awareness. Hot-spot drops to 1.14 (best), but APC only drops 3 pp because the picker's `min(num_requests)` tie-break to instance 0 creates accidental stickiness at low concurrency.
+- **sticky** locks each session to one worker. APC climbs to 77.2% (97% of ceiling) but interference doubles to 13.65 and TPOT p99 hits 345 ms.
+- **unified** is the hybrid: the affinity gate `(cache_ratio>0.5 AND num_req ≤ 2×avg)` keeps locality where it pays and drops it where it would hurt. The result is APC 79.4% **and** a TTFT p90 less than half of lmetric's (7.24 s vs 15.59 s). One bad worker (engine_4 at 37.7 s p90) drives `hotspot_index=3.35`; the other seven workers are all under 19 s.
+- **capped** runs lmetric on a turn-capped trace (max 8 turns/session). Removes 37% of requests but APC also crashes to 31.6% and hotspot only improves by 13%. This is the session-mass ablation: heavy sessions are *a* contributor to hot-spot but not the sole cause.
+
+### Slow-request cause breakdown (from `joined_analysis.label_slow_requests`)
+
+| policy | n_slow | same-worker overlap | hot worker queue | cache miss large append | unknown |
+|---|---:|---:|---:|---:|---:|
+| lmetric | 295 | 69 (23%) | 68 (23%) | 94 (32%) | 64 (22%) |
+| load_only | 379 | 108 (28%) | 33 (9%) | 151 (40%) | 87 (23%) |
+| sticky | 234 | **134 (57%)** | 51 (22%) | **20 (9%)** | 29 (12%) |
+| unified | 189 | 0 (no engine_state) | 116 (61%) | 18 (10%) | 55 (29%) |
+| capped | 185 | 45 (24%) | 66 (36%) | 60 (32%) | 14 (8%) |
+
+PD-colo failures are mixed-mechanism: lmetric has no single dominant cause.
+Sticky concentrates failures into same-worker overlap (locality is on, cache misses are gone, but interference takes over).
+
+## B2 PD-colo interference microbench
+
+Setup: 2 vLLM instances on GPU 0 (decode endpoint) and GPU 1 (prefill endpoint). A continuous 4 req/s short-prompt decode load runs against GPU 0 for 60 s per cell. 4 large-prompt one-token "prefill injections" fire every 12 s, targeted at either the same instance (`same`) or the paired one (`different`). Decode requests are labeled overlap iff their `[t_first_token, t_finish]` intersects any injection window. We compare TPOT p90 (overlap vs clean) per cell.
+
+| variant | prefill | n_overlap | n_clean | **TPOT idx** | **TTFT idx** |
+|---|---:|---:|---:|---:|---:|
+| different | 2k–65k | 12–126 | 114–228 | **0.92–1.02** | **0.96–1.00** |
+| same | 2k | 12 | 228 | 1.16 | 2.15 |
+| same | 8k | 19 | 221 | 1.90 | **12.1** |
+| same | 16k | 37 | 203 | 3.37 | **30.8** |
+| same | 32k | 67 | 173 | **7.89** | **94.6** |
+| same | 65k | 130 | 110 | 2.26* | **218** |
+
+\*The 65k TPOT idx is not comparable to the other rows: n_overlap > n_clean (130 vs 110), because each 65k prefill finishes only about 4 s before the next injection fires. The few "clean" decodes are the ones that landed in those brief gaps, and their baseline is itself inflated (clean TPOT p90 9.6 ms vs about 7 ms in every other cell), which suppresses the ratio.
+
+Figures: `fig_b2_tpot_vs_prefill.png`, `fig_b2_ttft_vs_prefill.png`.
+
+**Why this matters**
+- The `different-worker` control sits at idx ≈ 1.0 across 32× variation in prefill size. This is the cleanest possible disproof of "any prefill anywhere hurts decode": prefill on a *different* worker is invisible to the decode worker.
+- The `same-worker` curve is monotone in prefill size for TTFT (218× at 65k) and monotone-up-to-32k for TPOT (7.89×). The two ablations together establish causation: prefill-decode interference is a same-worker phenomenon and scales sharply with prefill mass.
+- This is the mechanism behind the B3 sticky interference jump (13.65), and plausibly behind unified's single hot worker (engine_4 at 37.7 s TTFT p90), although the latter cannot be confirmed without unified's `engine_state`.
+
+## What Window 1 does *not* answer
+
+These need Window 2 (B4 SRR sweep + B5 failure attribution near SRR boundary):
+
+1. **Sustainable arrival rate (SRR) per policy under SLO**. B3 was driven by trace timestamps with strict session sequentiality; when 8 instances cannot keep up, requests pile up and the *effective* dispatch window stretches (lmetric: trace claims 600 s, actual replay 49 min). We measured *saturated* behavior but not the saturation point. B4 needs the A4 open-loop Poisson loadgen with per-class SLO thresholds.
+2. **Failure breakdown at the SRR boundary**. B5 will rerun each policy at 0.9× / 1.0× / 1.1× of its SRR_max and label each SLO-violating request — gives the paper its causal failure-attribution table.
+
+Optional / paper-polish runs (not blocking the story):
+
+3. unified isolated rerun to capture `interference_index` (B2 already provides cleaner causal proof; skip unless reviewer asks).
+4. B2 with the proxy in path — measure whether the production cache_aware routing actually pushes prefill and decode onto different workers in practice.
+5. KV-occupancy timeline per worker — needs polling `vllm:gpu_cache_usage` during B3 reruns; useful for "KV pressure drives cache miss" subsection.
+
+## Caveats and known data hygiene issues
+
+- **APC contamination across B3 hot-sweep**: `lmetric` ran from cold; `load_only` and `sticky` ran on the same 8 vLLMs without restart. Empirical contamination is < 1% (verified by first-turn cached_tokens distribution), but `unified` and `capped` were rerun cold-start specifically to remove any residual concern.
+- **Unified's `interference_index` is missing** because the original `b3_analyze.sh` unconditionally truncated and rewrote the sliced engine_state files, so isolated runs that had written engine_state into their own per-policy directory were overwritten. Fixed in commit `df32499`; capped was the first run to benefit and kept its 86 MB engine_state intact.
+- **w600 is not the full GLM-5.1 trace** (1214 req vs 2.11 M). All B3 percentiles are computed on this sample, and B2 uses its own microbench load; only the B1' KV-footprint stats are computed on the full trace.
+
+## Reproduction commands
+
+```bash
+# B3 5-policy sweep
+bash scripts/b3_sweep.sh                                   # lmetric, load_only, sticky (hot-cache)
+bash scripts/b3_isolated_policy.sh unified <trace> <dir>   # isolated cold-start
+bash scripts/b3_isolated_policy.sh lmetric <capped> <dir>  # capped variant
+
+bash scripts/b3_analyze.sh outputs/b3_sweep_<TS>
+python3 scripts/render_b3_report.py --sweep-dir outputs/b3_sweep_<TS>
+
+# B2 interference microbench
+# (launch 2 vLLM on ports 8100/8101 with --enable-prompt-tokens-details first)
+python3 scripts/b2_interference.py \
+    --decode-endpoint http://127.0.0.1:8100 \
+    --prefill-endpoint http://127.0.0.1:8101 \
+    --model <model> \
+    --out-dir outputs/b2_microbench/sweep
+python3 analysis/characterization/b2_sweep_analysis.py --sweep-dir outputs/b2_microbench/sweep
+
+# Figures
+python3 analysis/characterization/render_window1_figures.py \
+    --results-dir analysis/characterization/window_1_results \
+    --out-dir analysis/characterization/window_1_results/figures
+```
diff --git a/analysis/characterization/window_1_results/apc_upper_w600.json b/analysis/characterization/window_1_results/apc_upper_w600.json
new file mode 100644
index 0000000..f5074e2
--- /dev/null
+++ b/analysis/characterization/window_1_results/apc_upper_w600.json
@@ -0,0 +1,18 @@
+{
+  "trace": "/home/admin/cpfs/wjh/agentic-kv/traces/w600_r0.0015_st30.jsonl",
+  "n_requests": 1214,
+  "n_sessions": 274,
+  "block_size": 512,
+  "shared_prefix_min_sessions": 8,
+  "total_input_tokens": 53335690,
+  "apc_upper_any_session": 0.8030439654947747,
+  "apc_upper_intra_session": 0.7956783534627564,
+  "apc_upper_shared_prefix_only": 0.0010271546126055554,
+  "cached_tokens_any_session": 42830904,
+  "cached_tokens_intra_session": 42438054,
+  "cached_tokens_shared_prefix_only": 54784,
+  "n_requests_any_hit": 961,
+  "n_requests_intra_hit": 914,
+  "n_requests_shared_hit": 107,
+  "n_shared_pos0_blocks": 1
+}
\ No newline at end of file
diff --git a/analysis/characterization/window_1_results/b2_sweep_summary.json b/analysis/characterization/window_1_results/b2_sweep_summary.json
new file mode 100644
index 0000000..5a433fa
--- /dev/null
+++ b/analysis/characterization/window_1_results/b2_sweep_summary.json
@@ -0,0 +1,194 @@
+{
+  "rows": [
+    {
+      "decode_endpoint": "http://127.0.0.1:8100",
+      "interference_index": 0.9868436853823819,
+      "n_decode_clean": 207,
+      "n_decode_overlap": 33,
+      "n_decode_total": 240,
+      "n_prefill_injections": 4,
+      "prefill_endpoint": "http://127.0.0.1:8101",
+      "prefill_size": 16384,
+      "tpot_p50_clean_s": 0.0061757058808297825,
+      "tpot_p50_overlap_s": 0.006127697048765241,
+      "tpot_p90_clean_s": 0.006862485770023231,
+      "tpot_p90_overlap_s": 0.006772200748173878,
+      "tpot_p99_clean_s": 0.007128368820806946,
+      "tpot_p99_overlap_s": 0.0070623818792478,
+      "ttft_p90_clean_s": 0.043039703369140626,
+      "ttft_p90_overlap_s": 0.04307723045349121,
+      "variant": "different"
+    },
+    {
+      "decode_endpoint": "http://127.0.0.1:8100",
+      "interference_index": 1.0176125863449343,
+      "n_decode_clean": 228,
+      "n_decode_overlap": 12,
+      "n_decode_total": 240,
+      "n_prefill_injections": 4,
+      "prefill_endpoint": "http://127.0.0.1:8101",
+      "prefill_size": 2048,
+      "tpot_p50_clean_s": 0.0062349300191860005,
+      "tpot_p50_overlap_s": 0.006218204594621754,
+      "tpot_p90_clean_s": 0.006892242576136734,
+      "tpot_p90_overlap_s": 0.007013632793619174,
+      "tpot_p99_clean_s": 0.007111345902837888,
+      "tpot_p99_overlap_s": 0.007131954732567373,
+      "ttft_p90_clean_s": 0.04290406703948975,
+      "ttft_p90_overlap_s": 0.040976309776306154,
+      "variant": "different"
+    },
+    {
+      "decode_endpoint": "http://127.0.0.1:8100",
+      "interference_index": 0.9221676118155049,
+      "n_decode_clean": 176,
+      "n_decode_overlap": 64,
+      "n_decode_total": 240,
+      "n_prefill_injections": 4,
+      "prefill_endpoint": "http://127.0.0.1:8101",
+      "prefill_size": 32768,
+      "tpot_p50_clean_s": 0.00620933012528853,
+      "tpot_p50_overlap_s": 0.005991364970351711,
+      "tpot_p90_clean_s": 0.0069098352181791054,
+      "tpot_p90_overlap_s": 0.006372026241186894,
+      "tpot_p99_clean_s": 0.007242970394365715,
+      "tpot_p99_overlap_s": 0.006935877366499467,
+      "ttft_p90_clean_s": 0.04308474063873291,
+      "ttft_p90_overlap_s": 0.04266033172607422,
+      "variant": "different"
+    },
+    {
+      "decode_endpoint": "http://127.0.0.1:8100",
+      "interference_index": 1.0162810692345416,
+      "n_decode_clean": 114,
+      "n_decode_overlap": 126,
+      "n_decode_total": 240,
+      "n_prefill_injections": 4,
+      "prefill_endpoint": "http://127.0.0.1:8101",
+      "prefill_size": 65536,
+      "tpot_p50_clean_s": 0.006080349286397299,
+      "tpot_p50_overlap_s": 0.006312949488861392,
+      "tpot_p90_clean_s": 0.0068880830148253785,
+      "tpot_p90_overlap_s": 0.007000228371283021,
+      "tpot_p99_clean_s": 0.007222196574162956,
+      "tpot_p99_overlap_s": 0.00723441562267265,
+      "ttft_p90_clean_s": 0.04367616176605225,
+      "ttft_p90_overlap_s": 0.04332089424133301,
+      "variant": "different"
+    },
+    {
+      "decode_endpoint": "http://127.0.0.1:8100",
+      "interference_index": 0.92169565663476,
+      "n_decode_clean": 220,
+      "n_decode_overlap": 20,
+      "n_decode_total": 240,
+      "n_prefill_injections": 4,
+      "prefill_endpoint": "http://127.0.0.1:8101",
+      "prefill_size": 8192,
+      "tpot_p50_clean_s": 0.006260122915711066,
+      "tpot_p50_overlap_s": 0.006120474651606396,
+      "tpot_p90_clean_s": 0.006968991684191154,
+      "tpot_p90_overlap_s": 0.006423289366442748,
+      "tpot_p99_clean_s": 0.007601349209294174,
+      "tpot_p99_overlap_s": 0.006715166592838788,
+      "ttft_p90_clean_s": 0.04314079284667969,
+      "ttft_p90_overlap_s": 0.042817187309265134,
+      "variant": "different"
+    },
+    {
+      "decode_endpoint": "http://127.0.0.1:8100",
+      "interference_index": 3.3716068170318985,
+      "n_decode_clean": 203,
+      "n_decode_overlap": 37,
+      "n_decode_total": 240,
+      "n_prefill_injections": 4,
+      "prefill_endpoint": "http://127.0.0.1:8100",
+      "prefill_size": 16384,
+      "tpot_p50_clean_s": 0.006435276281954062,
+      "tpot_p50_overlap_s": 0.009116151116111061,
+      "tpot_p90_clean_s": 0.0071605749804564195,
+      "tpot_p90_overlap_s": 0.024142643417974917,
+      "tpot_p99_clean_s": 0.008356584539317119,
+      "tpot_p99_overlap_s": 0.024809808827409838,
+      "ttft_p90_clean_s": 0.04402604103088379,
+      "ttft_p90_overlap_s": 1.3574100017547606,
+      "variant": "same"
+    },
+    {
+      "decode_endpoint": "http://127.0.0.1:8100",
+      "interference_index": 1.1589170446597312,
+      "n_decode_clean": 228,
+      "n_decode_overlap": 12,
+      "n_decode_total": 240,
+      "n_prefill_injections": 4,
+      "prefill_endpoint": "http://127.0.0.1:8100",
+      "prefill_size": 2048,
+      "tpot_p50_clean_s": 0.006142637946388938,
+      "tpot_p50_overlap_s": 0.007610858088791972,
+      "tpot_p90_clean_s": 0.006933137142296993,
+      "tpot_p90_overlap_s": 0.008034930807171445,
+      "tpot_p99_clean_s": 0.007201877651792584,
+      "tpot_p99_overlap_s": 0.0084272463153107,
+      "ttft_p90_clean_s": 0.043091440200805665,
+      "ttft_p90_overlap_s": 0.09247522354125978,
+      "variant": "same"
+    },
+    {
+      "decode_endpoint": "http://127.0.0.1:8100",
+      "interference_index": 7.891276559921504,
+      "n_decode_clean": 173,
+      "n_decode_overlap": 67,
+      "n_decode_total": 240,
+      "n_prefill_injections": 4,
+      "prefill_endpoint": "http://127.0.0.1:8100",
+      "prefill_size": 32768,
+      "tpot_p50_clean_s": 0.006226602226796776,
+      "tpot_p50_overlap_s": 0.012180752224392362,
+      "tpot_p90_clean_s": 0.00694006813897027,
+      "tpot_p90_overlap_s": 0.054765997029314145,
+      "tpot_p99_clean_s": 0.010443444107518053,
+      "tpot_p99_overlap_s": 0.058983875428787386,
+      "ttft_p90_clean_s": 0.04411859512329101,
+      "ttft_p90_overlap_s": 4.174754428863525,
+      "variant": "same"
+    },
+    {
+      "decode_endpoint": "http://127.0.0.1:8100",
+      "interference_index": 2.259323176730457,
+      "n_decode_clean": 110,
+      "n_decode_overlap": 130,
+      "n_decode_total": 240,
+      "n_prefill_injections": 4,
+      "prefill_endpoint": "http://127.0.0.1:8100",
+      "prefill_size": 65536,
+      "tpot_p50_clean_s": 0.0064652375500611585,
+      "tpot_p50_overlap_s": 0.020095128001588764,
+      "tpot_p90_clean_s": 0.009607415488272014,
+      "tpot_p90_overlap_s": 0.021706256481132124,
+      "tpot_p99_clean_s": 0.016912007837584522,
+      "tpot_p99_overlap_s": 0.16948255478733715,
+      "ttft_p90_clean_s": 0.06447408199310305,
+      "ttft_p90_overlap_s": 14.060086917877197,
+      "variant": "same"
+    },
+    {
+      "decode_endpoint": "http://127.0.0.1:8100",
+      "interference_index": 1.8961314610807898,
+      "n_decode_clean": 221,
+      "n_decode_overlap": 19,
+      "n_decode_total": 240,
+      "n_prefill_injections": 4,
+      "prefill_endpoint": "http://127.0.0.1:8100",
+      "prefill_size": 8192,
+      "tpot_p50_clean_s": 0.00617263052198622,
+      "tpot_p50_overlap_s": 0.008303543533941712,
+      "tpot_p90_clean_s": 0.007060385713673601,
+      "tpot_p90_overlap_s": 0.013387419479061859,
+      "tpot_p99_clean_s": 0.0076809098022152696,
+      "tpot_p99_overlap_s": 0.013849472662415166,
+      "ttft_p90_clean_s": 0.04307150840759277,
+      "ttft_p90_overlap_s": 0.52073073387146,
+      "variant": "same"
+    }
+  ]
+}
\ No newline at end of file
diff --git a/analysis/characterization/window_1_results/b3_policy_comparison.json b/analysis/characterization/window_1_results/b3_policy_comparison.json
new file mode 100644
index 0000000..4646363
--- /dev/null
+++ b/analysis/characterization/window_1_results/b3_policy_comparison.json
@@ -0,0 +1,133 @@
+{
+  "rows": [
+    {
+      "policy": "capped",
+      "n_ok": 770,
+      "n_total": 770,
+      "ttft_p50_s": 1.195636051998008,
+      "ttft_p90_s": 12.762421467981767,
+      "ttft_p99_s": 46.05476947501302,
+      "tpot_p50_s": 0.007229394937166944,
+      "tpot_p90_s": 0.015995440982929352,
+      "tpot_p99_s": 0.10145225453431651,
+      "e2e_p50_s": 2.5921602529706433,
+      "e2e_p90_s": 21.238469071977306,
+      "e2e_p99_s": 73.38509433099534,
+      "apc_ratio": 0.3158312503528108,
+      "interference_index": 6.331064378362814,
+      "hotspot_index_ttft_p90": 1.9366915542605314,
+      "reuse_intra_frac": 0.9192657105586233,
+      "reuse_cross_frac": 0.0602232594931501,
+      "n_slow": 185,
+      "failure_counts": {
+        "cache_miss_large_append": 60,
+        "hot_worker_queue": 66,
+        "same_worker_prefill_overlap": 45,
+        "unknown": 14
+      }
+    },
+    {
+      "policy": "lmetric",
+      "n_ok": 1214,
+      "n_total": 1214,
+      "ttft_p50_s": 0.9369571270071901,
+      "ttft_p90_s": 15.592678204004187,
+      "ttft_p99_s": 52.95170431700535,
+      "tpot_p50_s": 0.008851506907892485,
+      "tpot_p90_s": 0.02120516549011311,
+      "tpot_p99_s": 0.17592118933357093,
+      "e2e_p50_s": 2.7527842019917443,
+      "e2e_p90_s": 24.75416105298791,
+      "e2e_p99_s": 79.61890332301846,
+      "apc_ratio": 0.5694312382571595,
+      "interference_index": 6.530231061794441,
+      "hotspot_index_ttft_p90": 2.237981740718548,
+      "reuse_intra_frac": 0.9321238805590836,
+      "reuse_cross_frac": 0.05679481258506571,
+      "n_slow": 295,
+      "failure_counts": {
+        "cache_miss_large_append": 94,
+        "hot_worker_queue": 68,
+        "same_worker_prefill_overlap": 69,
+        "unknown": 64
+      }
+    },
+    {
+      "policy": "load_only",
+      "n_ok": 1214,
+      "n_total": 1214,
+      "ttft_p50_s": 1.2542553890380077,
+      "ttft_p90_s": 20.14692750602262,
+      "ttft_p99_s": 52.64810254302574,
+      "tpot_p50_s": 0.00923045912795929,
+      "tpot_p90_s": 0.02672785480314115,
+      "tpot_p99_s": 0.3207044094773148,
+      "e2e_p50_s": 3.584156609023921,
+      "e2e_p90_s": 33.42658680601744,
+      "e2e_p99_s": 93.91839688795153,
+      "apc_ratio": 0.5412093853102866,
+      "interference_index": 9.16424627504275,
+      "hotspot_index_ttft_p90": 1.1400531308102801,
+      "reuse_intra_frac": 0.9353191550754928,
+      "reuse_cross_frac": 0.053372184678592026,
+      "n_slow": 379,
+      "failure_counts": {
+        "cache_miss_large_append": 151,
+        "hot_worker_queue": 33,
+        "same_worker_prefill_overlap": 108,
+        "unknown": 87
+      }
+    },
+    {
+      "policy": "sticky",
+      "n_ok": 1214,
+      "n_total": 1214,
+      "ttft_p50_s": 0.540947844972834,
+      "ttft_p90_s": 18.016640832996927,
+      "ttft_p99_s": 71.37327494798228,
+      "tpot_p50_s": 0.00894752275507555,
+      "tpot_p90_s": 0.0360956137329512,
+      "tpot_p99_s": 0.34523129428917954,
+      "e2e_p50_s": 2.0788628259906545,
+      "e2e_p90_s": 34.605129147996195,
+      "e2e_p99_s": 133.5824547969969,
+      "apc_ratio": 0.7720092868396378,
+      "interference_index": 13.651718321568111,
+      "hotspot_index_ttft_p90": 2.3493858974059214,
+      "reuse_intra_frac": 0.9327723488279339,
+      "reuse_cross_frac": 0.05495149683864246,
+      "n_slow": 234,
+      "failure_counts": {
+        "cache_miss_large_append": 20,
+        "hot_worker_queue": 51,
+        "same_worker_prefill_overlap": 134,
+        "unknown": 29
+      }
+    },
+    {
+      "policy": "unified",
+      "n_ok": 1213,
+      "n_total": 1214,
+      "ttft_p50_s": 0.4997710260213353,
+      "ttft_p90_s": 7.239999514014926,
+      "ttft_p99_s": 42.022206099005416,
+      "tpot_p50_s": 0.008079791456705824,
+      "tpot_p90_s": 0.017107906969874808,
+      "tpot_p99_s": 0.11808861252148231,
+      "e2e_p50_s": 1.7495028690318577,
+      "e2e_p90_s": 17.893827292020433,
+      "e2e_p99_s": 68.18008507299237,
+      "apc_ratio": 0.794261466256467,
+      "interference_index": null,
+      "hotspot_index_ttft_p90": 3.3497107140827365,
+      "reuse_intra_frac": 0.9311187350942534,
+      "reuse_cross_frac": 0.056702150437367635,
+      "n_slow": 189,
+      "failure_counts": {
+        "cache_miss_large_append": 18,
+        "hot_worker_queue": 116,
+        "unknown": 55
+      }
+    }
+  ]
+}
\ No newline at end of file
diff --git a/analysis/characterization/window_1_results/b3_report.md b/analysis/characterization/window_1_results/b3_report.md
new file mode 100644
index 0000000..276ce7b
--- /dev/null
+++ b/analysis/characterization/window_1_results/b3_report.md
@@ -0,0 +1,114 @@
+# B3 Routing Sweep Report
+
+Sweep dir: `b3_sweep_20260525_095043`
+Trace: w600_r0.0015_st30.jsonl (~1.2k reqs, 8 × TP1)
+Policies present: lmetric, load_only, sticky, unified, capped
+Policies pending: —
+
+## Headline latencies + APC
+
+| policy | ok/total | TTFT p50/p90/p99 (s) | TPOT p50/p90/p99 (ms) | E2E p50/p90/p99 (s) | APC |
+|---|---:|---|---|---|---:|
+| **lmetric** | 1214/1214 | 0.94/15.59/52.95 | 8.9/21.2/175.9 | 2.75/24.75/79.62 | 56.9% |
+| **load_only** | 1214/1214 | 1.25/20.15/52.65 | 9.2/26.7/320.7 | 3.58/33.43/93.92 | 54.1% |
+| **sticky** | 1214/1214 | 0.54/18.02/71.37 | 8.9/36.1/345.2 | 2.08/34.61/133.58 | 77.2% |
+| **unified** | 1213/1214 | 0.50/7.24/42.02 | 8.1/17.1/118.1 | 1.75/17.89/68.18 | 79.4% |
+| **capped** | 770/770 | 1.20/12.76/46.05 | 7.2/16.0/101.5 | 2.59/21.24/73.39 | 31.6% |
+
+## Mechanism indices
+
+| policy | interference_index | hotspot_index (TTFT p90) | intra-session reuse | cross-session reuse | n_slow |
+|---|---:|---:|---:|---:|---:|
+| **lmetric** | 6.53 | 2.24 | 93.2% | 5.7% | 295 |
+| **load_only** | 9.16 | 1.14 | 93.5% | 5.3% | 379 |
+| **sticky** | 13.65 | 2.35 | 93.3% | 5.5% | 234 |
+| **unified** | — | 3.35 | 93.1% | 5.7% | 189 |
+| **capped** | 6.33 | 1.94 | 91.9% | 6.0% | 185 |
+
+- **interference_index** = TPOT_p90(decode overlapping same-worker prefill) / TPOT_p90(clean)
+- **hotspot_index** = max(worker TTFT_p90) / median(worker TTFT_p90)
+
+## Slow-request cause breakdown
+
+| policy | n_slow | same-worker overlap | hot worker queue | cache miss large append | high KV | unknown |
+|---|---:|---:|---:|---:|---:|---:|
+| **lmetric** | 295 | 69 | 68 | 94 | 0 | 64 |
+| **load_only** | 379 | 108 | 33 | 151 | 0 | 87 |
+| **sticky** | 234 | 134 | 51 | 20 | 0 | 29 |
+| **unified** | 189 | 0 | 116 | 18 | 0 | 55 |
+| **capped** | 185 | 45 | 66 | 60 | 0 | 14 |
+
+## Policy notes
+
+- **lmetric** — cache-aware P_tokens × BS (main baseline)
+- **load_only** — control: min(num_requests), no cache, no affinity
+- **sticky** — control: hard session affinity (never break)
+- **unified** — hybrid affinity + LMetric fallback
+- **capped** — lmetric on per-session turn-capped trace
+
+## Per-policy per-worker TTFT p90 (s)
+
+### lmetric
+
+| worker | TTFT p90 (s) |
+|---|---:|
+| http://127.0.0.1:8000 | 28.18 |
+| http://127.0.0.1:8001 | 13.15 |
+| http://127.0.0.1:8002 | 13.82 |
+| http://127.0.0.1:8003 | 14.00 |
+| http://127.0.0.1:8004 | 31.34 |
+| http://127.0.0.1:8005 | 7.87 |
+| http://127.0.0.1:8006 | 14.15 |
+| http://127.0.0.1:8007 | 11.78 |
+
+### load_only
+
+| worker | TTFT p90 (s) |
+|---|---:|
+| http://127.0.0.1:8000 | 22.06 |
+| http://127.0.0.1:8001 | 16.43 |
+| http://127.0.0.1:8002 | 16.81 |
+| http://127.0.0.1:8003 | 23.58 |
+| http://127.0.0.1:8004 | 25.14 |
+| http://127.0.0.1:8005 | 16.08 |
+| http://127.0.0.1:8006 | 23.96 |
+| http://127.0.0.1:8007 | 13.95 |
+
+### sticky
+
+| worker | TTFT p90 (s) |
+|---|---:|
+| http://127.0.0.1:8000 | 12.28 |
+| http://127.0.0.1:8001 | 23.57 |
+| http://127.0.0.1:8002 | 5.20 |
+| http://127.0.0.1:8003 | 55.38 |
+| http://127.0.0.1:8004 | 17.03 |
+| http://127.0.0.1:8005 | 25.49 |
+| http://127.0.0.1:8006 | 36.31 |
+| http://127.0.0.1:8007 | 2.50 |
+
+### unified
+
+| worker | TTFT p90 (s) |
+|---|---:|
+| http://127.0.0.1:8000 | 11.26 |
+| http://127.0.0.1:8001 | 3.61 |
+| http://127.0.0.1:8002 | 16.18 |
+| http://127.0.0.1:8003 | 9.31 |
+| http://127.0.0.1:8004 | 37.73 |
+| http://127.0.0.1:8005 | 18.33 |
+| http://127.0.0.1:8006 | 3.63 |
+| http://127.0.0.1:8007 | 7.77 |
+
+### capped
+
+| worker | TTFT p90 (s) |
+|---|---:|
+| http://127.0.0.1:8000 | 19.77 |
+| http://127.0.0.1:8001 | 15.79 |
+| http://127.0.0.1:8002 | 20.40 |
+| http://127.0.0.1:8003 | 10.54 |
+| http://127.0.0.1:8004 | 9.52 |
+| http://127.0.0.1:8005 | 9.46 |
+| http://127.0.0.1:8006 | 7.38 |
+| http://127.0.0.1:8007 | 9.66 |
diff --git a/analysis/characterization/window_1_results/figures/fig_b2_tpot_vs_prefill.png b/analysis/characterization/window_1_results/figures/fig_b2_tpot_vs_prefill.png
new file mode 100644
index 0000000..a4bcff9
Binary files /dev/null and b/analysis/characterization/window_1_results/figures/fig_b2_tpot_vs_prefill.png differ
diff --git a/analysis/characterization/window_1_results/figures/fig_b2_ttft_vs_prefill.png b/analysis/characterization/window_1_results/figures/fig_b2_ttft_vs_prefill.png
new file mode 100644
index 0000000..15f3497
Binary files /dev/null and b/analysis/characterization/window_1_results/figures/fig_b2_ttft_vs_prefill.png differ
diff --git a/analysis/characterization/window_1_results/figures/fig_b3_apc_vs_hotspot.png b/analysis/characterization/window_1_results/figures/fig_b3_apc_vs_hotspot.png
new file mode 100644
index 0000000..166a94e
Binary files /dev/null and b/analysis/characterization/window_1_results/figures/fig_b3_apc_vs_hotspot.png differ
diff --git a/analysis/characterization/window_1_results/figures/fig_b3_apc_vs_upper.png b/analysis/characterization/window_1_results/figures/fig_b3_apc_vs_upper.png
new file mode 100644
index 0000000..759d965
Binary files /dev/null and b/analysis/characterization/window_1_results/figures/fig_b3_apc_vs_upper.png differ
diff --git a/analysis/characterization/window_1_results/figures/fig_b3_failure_breakdown.png b/analysis/characterization/window_1_results/figures/fig_b3_failure_breakdown.png
new file mode 100644
index 0000000..de90d42
Binary files /dev/null and b/analysis/characterization/window_1_results/figures/fig_b3_failure_breakdown.png differ
diff --git a/analysis/characterization/window_1_results/figures/fig_b3_latency_bars.png b/analysis/characterization/window_1_results/figures/fig_b3_latency_bars.png
new file mode 100644
index 0000000..df5afe4
Binary files /dev/null and b/analysis/characterization/window_1_results/figures/fig_b3_latency_bars.png differ
diff --git a/analysis/characterization/window_1_results/figures/fig_b3_per_worker_ttft_p90.png b/analysis/characterization/window_1_results/figures/fig_b3_per_worker_ttft_p90.png
new file mode 100644
index 0000000..743c484
Binary files /dev/null and b/analysis/characterization/window_1_results/figures/fig_b3_per_worker_ttft_p90.png differ
diff --git a/analysis/characterization/window_1_results/figures/fig_kv_footprint_cdf.png b/analysis/characterization/window_1_results/figures/fig_kv_footprint_cdf.png
new file mode 100644
index 0000000..63ac975
Binary files /dev/null and b/analysis/characterization/window_1_results/figures/fig_kv_footprint_cdf.png differ
diff --git a/analysis/characterization/window_1_results/figures/fig_reuse_decomposition.png b/analysis/characterization/window_1_results/figures/fig_reuse_decomposition.png
new file mode 100644
index 0000000..9a628f6
Binary files /dev/null and b/analysis/characterization/window_1_results/figures/fig_reuse_decomposition.png differ
diff --git a/analysis/characterization/window_1_results/kv_footprint_summary.json b/analysis/characterization/window_1_results/kv_footprint_summary.json
new file mode 100644
index 0000000..6f00a36
--- /dev/null
+++ b/analysis/characterization/window_1_results/kv_footprint_summary.json
@@ -0,0 +1,26 @@
+{
+  "formula": "kv_bytes_per_request = input_tokens * kv_bytes_per_token",
+  "kv_bytes_per_request": {
+    "count": 2114220,
+    "max": 19893878784.0,
+    "mean": 3306689367.3278427,
+    "min": 0.0,
+    "p50": 1969029120.0,
+    "p90": 8636507750.40001,
+    "p95": 10296164352.0,
+    "p99": 12339806208.0
+  },
+  "kv_bytes_per_token": 98304.0,
+  "kv_mib_per_request": {
+    "count": 2114220,
+    "max": 18972.28125,
+    "mean": 3153.5047219541957,
+    "min": 0.0,
+    "p50": 1877.8125,
+    "p90": 8236.415625000009,
+    "p95": 9819.1875,
+    "p99": 11768.15625
+  },
+  "status": "available",
+  "total_kv_gib": 6510940.188720703
+}
diff --git a/analysis/characterization/window_1_results/lmetric_hotspot.json b/analysis/characterization/window_1_results/lmetric_hotspot.json
new file mode 100644
index 0000000..03ac5fb
--- /dev/null
+++ b/analysis/characterization/window_1_results/lmetric_hotspot.json
@@ -0,0 +1,24 @@
+{
+  "hotspot_index_ttft_p90": 2.237981740718548,
+  "per_worker_latency_p90_s": {
+    "http://127.0.0.1:8000": 34.71445541951107,
+    "http://127.0.0.1:8001": 21.922988962882666,
+    "http://127.0.0.1:8002": 23.936190764518685,
+    "http://127.0.0.1:8003": 26.22220957049285,
+    "http://127.0.0.1:8004": 40.318757307820505,
+    "http://127.0.0.1:8005": 12.26559703698149,
+    "http://127.0.0.1:8006": 27.904838753980588,
+    "http://127.0.0.1:8007": 18.430557113309625
+  },
+  "per_worker_ttft_p90_s": {
+    "http://127.0.0.1:8000": 28.18261351052206,
+    "http://127.0.0.1:8001": 13.147308969072796,
+    "http://127.0.0.1:8002": 13.818959677941162,
+    "http://127.0.0.1:8003": 14.003642184572524,
+    "http://127.0.0.1:8004": 31.339895512629305,
+    "http://127.0.0.1:8005": 7.870992770011071,
+    "http://127.0.0.1:8006": 14.149156623415186,
+    "http://127.0.0.1:8007": 11.777357225219024
+  },
+  "status": "supported"
+}
diff --git a/analysis/characterization/window_1_results/lmetric_reuse.json b/analysis/characterization/window_1_results/lmetric_reuse.json
new file mode 100644
index 0000000..44f208d
--- /dev/null
+++ b/analysis/characterization/window_1_results/lmetric_reuse.json
@@ -0,0 +1,15 @@
+{
+  "cross_session_tokens": 1723017,
+  "fractions": {
+    "cross": 0.05679481258506571,
+    "intra": 0.9321238805590836,
+    "shared": 0.011081306855850749,
+    "unclassified": 0.0
+  },
+  "intra_session_tokens": 28278380,
+  "shared_prefix_min_sessions": 8,
+  "shared_prefix_tokens": 336180,
+  "status": "supported",
+  "total_cached_tokens": 30371008,
+  "unclassified_tokens": 0
+}
diff --git a/analysis/characterization/window_1_results/per_worker_capped.json b/analysis/characterization/window_1_results/per_worker_capped.json
new file mode 100644
index 0000000..a025d3c
--- /dev/null
+++ b/analysis/characterization/window_1_results/per_worker_capped.json
@@ -0,0 +1,24 @@
+{
+  "hotspot_index_ttft_p90": 1.9366915542605314,
+  "per_worker_latency_p90_s": {
+    "http://127.0.0.1:8000": 23.81083881931848,
+    "http://127.0.0.1:8001": 18.139674991380897,
+    "http://127.0.0.1:8002": 29.116712999995805,
+    "http://127.0.0.1:8003": 19.245074290811324,
+    "http://127.0.0.1:8004": 17.230851700413044,
+    "http://127.0.0.1:8005": 15.86663371440958,
+    "http://127.0.0.1:8006": 16.707309890014592,
+    "http://127.0.0.1:8007": 23.93718611740042
+  },
+  "per_worker_ttft_p90_s": {
+    "http://127.0.0.1:8000": 19.772570010094213,
+    "http://127.0.0.1:8001": 15.786850639013576,
+    "http://127.0.0.1:8002": 20.403525242628533,
+    "http://127.0.0.1:8003": 10.535247699997853,
+    "http://127.0.0.1:8004": 9.52290979558602,
+    "http://127.0.0.1:8005": 9.455131393985376,
+    "http://127.0.0.1:8006": 7.379608143202497,
+    "http://127.0.0.1:8007": 9.661995008389932
+  },
+  "status": "supported"
+}
diff --git a/analysis/characterization/window_1_results/per_worker_lmetric.json b/analysis/characterization/window_1_results/per_worker_lmetric.json
new file mode 100644
index 0000000..03ac5fb
--- /dev/null
+++ b/analysis/characterization/window_1_results/per_worker_lmetric.json
@@ -0,0 +1,24 @@
+{
+  "hotspot_index_ttft_p90": 2.237981740718548,
+  "per_worker_latency_p90_s": {
+    "http://127.0.0.1:8000": 34.71445541951107,
+    "http://127.0.0.1:8001": 21.922988962882666,
+    "http://127.0.0.1:8002": 23.936190764518685,
+    "http://127.0.0.1:8003": 26.22220957049285,
+    "http://127.0.0.1:8004": 40.318757307820505,
+    "http://127.0.0.1:8005": 12.26559703698149,
+    "http://127.0.0.1:8006": 27.904838753980588,
+    "http://127.0.0.1:8007": 18.430557113309625
+  },
+  "per_worker_ttft_p90_s": {
+    "http://127.0.0.1:8000": 28.18261351052206,
+    "http://127.0.0.1:8001": 13.147308969072796,
+    "http://127.0.0.1:8002": 13.818959677941162,
+    "http://127.0.0.1:8003": 14.003642184572524,
+    "http://127.0.0.1:8004": 31.339895512629305,
+    "http://127.0.0.1:8005": 7.870992770011071,
+    "http://127.0.0.1:8006": 14.149156623415186,
+    "http://127.0.0.1:8007": 11.777357225219024
+  },
+  "status": "supported"
+}
diff --git a/analysis/characterization/window_1_results/per_worker_load_only.json b/analysis/characterization/window_1_results/per_worker_load_only.json
new file mode 100644
index 0000000..32ef216
--- /dev/null
+++ b/analysis/characterization/window_1_results/per_worker_load_only.json
@@ -0,0 +1,24 @@
+{
+  "hotspot_index_ttft_p90": 1.1400531308102801,
+  "per_worker_latency_p90_s": {
+    "http://127.0.0.1:8000": 33.51168999259829,
+    "http://127.0.0.1:8001": 29.20308109278556,
+    "http://127.0.0.1:8002": 27.126518827211115,
+    "http://127.0.0.1:8003": 38.597240307606995,
+    "http://127.0.0.1:8004": 36.607777832809376,
+    "http://127.0.0.1:8005": 28.097025175404276,
+    "http://127.0.0.1:8006": 49.29610514297965,
+    "http://127.0.0.1:8007": 20.958507975534303
+  },
+  "per_worker_ttft_p90_s": {
+    "http://127.0.0.1:8000": 22.055091864388675,
+    "http://127.0.0.1:8001": 16.425856862741057,
+    "http://127.0.0.1:8002": 16.806352904380766,
+    "http://127.0.0.1:8003": 23.581166115606912,
+    "http://127.0.0.1:8004": 25.14397653030465,
+    "http://127.0.0.1:8005": 16.080231266201018,
+    "http://127.0.0.1:8006": 23.960470345703648,
+    "http://127.0.0.1:8007": 13.95184187250561
+  },
+  "status": "supported"
+}
diff --git a/analysis/characterization/window_1_results/per_worker_sticky.json b/analysis/characterization/window_1_results/per_worker_sticky.json
new file mode 100644
index 0000000..ae978de
--- /dev/null
+++ b/analysis/characterization/window_1_results/per_worker_sticky.json
@@ -0,0 +1,24 @@
+{
+  "hotspot_index_ttft_p90": 2.3493858974059214,
+  "per_worker_latency_p90_s": {
+    "http://127.0.0.1:8000": 30.185792533413043,
+    "http://127.0.0.1:8001": 47.49661003401852,
+    "http://127.0.0.1:8002": 22.069474861002554,
+    "http://127.0.0.1:8003": 83.73774532350944,
+    "http://127.0.0.1:8004": 22.03310715127737,
+    "http://127.0.0.1:8005": 33.024566102202556,
+    "http://127.0.0.1:8006": 61.65600914339302,
+    "http://127.0.0.1:8007": 6.077459598158019
+  },
+  "per_worker_ttft_p90_s": {
+    "http://127.0.0.1:8000": 12.284569517592924,
+    "http://127.0.0.1:8001": 23.570226482005094,
+    "http://127.0.0.1:8002": 5.202772857400123,
+    "http://127.0.0.1:8003": 55.37555769548635,
+    "http://127.0.0.1:8004": 17.031311958114394,
+    "http://127.0.0.1:8005": 25.48531596700202,
+    "http://127.0.0.1:8006": 36.31029207323453,
+    "http://127.0.0.1:8007": 2.4984901855932535
+  },
+  "status": "supported"
+}
diff --git a/analysis/characterization/window_1_results/per_worker_unified.json b/analysis/characterization/window_1_results/per_worker_unified.json
new file mode 100644
index 0000000..5311e6d
--- /dev/null
+++ b/analysis/characterization/window_1_results/per_worker_unified.json
@@ -0,0 +1,24 @@
+{
+  "hotspot_index_ttft_p90": 3.3497107140827365,
+  "per_worker_latency_p90_s": {
+    "http://127.0.0.1:8000": 41.42001512600109,
+    "http://127.0.0.1:8001": 12.4878579101933,
+    "http://127.0.0.1:8002": 22.462878945574648,
+    "http://127.0.0.1:8003": 15.501050900109117,
+    "http://127.0.0.1:8004": 39.956250199786155,
+    "http://127.0.0.1:8005": 36.69850301651168,
+    "http://127.0.0.1:8006": 10.116177947795954,
+    "http://127.0.0.1:8007": 20.35038618039107
+  },
+  "per_worker_ttft_p90_s": {
+    "http://127.0.0.1:8000": 11.264844838529825,
+    "http://127.0.0.1:8001": 3.6063860427122614,
+    "http://127.0.0.1:8002": 16.175747957825664,
+    "http://127.0.0.1:8003": 9.314684258581842,
+    "http://127.0.0.1:8004": 37.73397144810297,
+    "http://127.0.0.1:8005": 18.328030522551852,
+    "http://127.0.0.1:8006": 3.6328767628350773,
+    "http://127.0.0.1:8007": 7.772977900883419
+  },
+  "status": "supported"
+}
diff --git a/analysis/characterization/window_1_results/summary.json b/analysis/characterization/window_1_results/summary.json
new file mode 100644
index 0000000..760867e
--- /dev/null
+++ b/analysis/characterization/window_1_results/summary.json
@@ -0,0 +1,136 @@
+{
+  "analyzed_records": 2114220,
+  "batch0": {
+    "attempted_requests": 2114220,
+    "completed_requests": null,
+    "error_requests": null,
+    "max_inflight_per_session": null,
+    "session_concurrency_status": "unavailable",
+    "session_sequential": null
+  },
+  "batch1": {
+    "append_status": "unavailable",
+    "input_stats": {
+      "count": 2114220,
+      "max": 202371.0,
+      "mean": 33637.38370084476,
+      "min": 0.0,
+      "p50": 20030.0,
+      "p90": 87855.1000000001,
+      "p95": 104738.0,
+      "p99": 125527.0
+    },
+    "kv_footprint_status": "available",
+    "output_stats": {
+      "count": 2114220,
+      "max": 132665.0,
+      "mean": 444.97059624826176,
+      "min": 0.0,
+      "p50": 80.0,
+      "p90": 811.0,
+      "p95": 2213.0,
+      "p99": 6614.810000000056
+    },
+    "reuse_status": "unavailable"
+  },
+  "classification": {
+    "label": "invalid_for_online_claim",
+    "reason": "actual dispatch/finish timestamps are unavailable, so online sequentiality cannot be proven",
+    "source": "auto",
+    "stress_indicators": []
+  },
+  "manifest": {
+    "canonical_trace_data_sources": {
+      "dash0_formatted_trace_dir": "~/ali-trace/trace-glm5.1-formatted/",
+      "dash0_raw_trace_dir": "~/ali-trace/trace-glm5.1/",
+      "usage_note": "Full trace analysis can be run CPU-only on dash0, or the needed JSONL files can be copied/rsynced from dash0 to this machine before running this analyzer."
+    },
+    "end_time": "2026-05-25T09:03:36.499002+00:00",
+    "figure_status": {
+      "reason": "matplotlib unavailable: ModuleNotFoundError(\"No module named 'matplotlib'\")",
+      "status": "skipped"
+    },
+    "git_commit": "",
+    "gpu_count": 0,
+    "gpu_type": "",
+    "host": "ds-6348bee4-1-765874c9c4-7zrvf",
+    "input_requirements": {
+      "actual_sequentiality_proof": [
+        "per-request dispatch timestamp",
+        "per-request finish or error/timeout timestamp",
+        "request_id join to trace/metrics when timing source is separate"
+      ],
+      "metrics_jsonl": [
+        "request_id",
+        "session_id",
+        "trace_timestamp_s",
+        "input_length",
+        "output_length",
+        "latency_s",
+        "ttft_s",
+        "tpot_s",
+        "error",
+        "optional cached_tokens"
+      ],
+      "reuse_decomposition": [
+        "cached_tokens or cache_hit",
+        "hash_ids",
+        "session_id"
+      ],
+      "trace_jsonl": [
+        "chat_id",
+        "parent_chat_id",
+        "timestamp",
+        "input_length",
+        "output_length",
+        "turn",
+        "hash_ids",
+        "optional session_id"
+      ]
+    },
+    "input_status": {
+      "analyzed_records": 2114220,
+      "breakdown_records": 0,
+      "merge_warnings": [],
+      "metrics_records": 0,
+      "trace_records": 2114220,
+      "trace_warnings": [],
+      "unmatched_breakdown": 0,
+      "unmatched_metrics": 0
+    },
+    "launch_command": "analysis/characterization/analyze.py --trace /home/admin/cpfs/wjh/ali-trace/trace-glm5.1-formatted/051315-051317.jsonl --kv-bytes-per-token 98304 --task-name full_trace_with_kv --output-root outputs/characterization --overwrite",
+    "output_dir": "outputs/characterization/2026-05-25/full_trace_with_kv",
+    "policy": "",
+    "request_limit": null,
+    "session_sampling_method": "",
+    "session_sequential": null,
+    "start_time": "2026-05-25T08:59:11.618919+00:00",
+    "time_scale": null,
+    "trace_file_info": {
+      "exists": true,
+      "mtime_s": 1778772033.2788928,
+      "path": "/home/admin/cpfs/wjh/ali-trace/trace-glm5.1-formatted/051315-051317.jsonl",
+      "sha256": "",
+      "sha256_status": "skipped_use_--hash-inputs",
+      "size_bytes": 1561266372
+    },
+    "trace_path": "/home/admin/cpfs/wjh/ali-trace/trace-glm5.1-formatted/051315-051317.jsonl",
+    "trace_sha256": ""
+  },
+  "outputs": [
+    "append_delta_stats.json",
+    "invalid_runs.md",
+    "kv_footprint_summary.json",
+    "manifest.json",
+    "raw/merged_requests.jsonl",
+    "raw/unmatched_breakdown.jsonl",
+    "raw/unmatched_metrics.jsonl",
+    "reuse_decomposition.json",
+    "session_arrival_stats.json",
+    "session_concurrency.json",
+    "session_skew.json",
+    "trace_profile.json",
+    "turn_interval_stats.json",
+    "workload_summary.json"
+  ]
+}