agentic-kvc/analysis/characterization/window_1_results/per_worker_sticky.json
Gahow Wang 0c3220cbb8 Window 1 results: combined B1' + B2 + B3 report and artifacts
analysis/characterization/window_1_results.md is the headline write-up
for Window 1: workload characterization (KV per request, real reuse
decomposition, APC theoretical ceilings), B3 5-policy sweep with
per-policy interpretation, B2 same-vs-different-worker interference
microbench with causal reading, and an explicit list of what Window 1
does *not* answer (deferred to B4 SRR sweep + B5 attribution).

Under window_1_results/:
- 5 raw result JSONs from the B3 sweep, the B2 microbench, the APC
  upper bound, and the KV footprint
- per-policy hotspot_index.json snapshots so render_window1_figures.py
  can plot per-worker TTFT p90 distributions
- 8 PNG figures (figures/) covering the headline claims

Three takeaways the figures pin down:
1) intra-session reuse dominates (93.2%), so session-affinity routing
   is the right primary lever
2) unified hybrid affinity hits 79.4% APC (97% of the 79.6% intra-
   session ceiling) AND cuts TTFT p90 from lmetric's 15.6s to 7.24s
3) B2 different-worker control sits at idx ≈ 1.0 across 32× prefill-
   size variation; same-worker TTFT idx scales 2.15× -> 218×, which
   is the cleanest causal evidence for same-worker prefill-decode
   interference

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 23:25:09 +08:00

{
  "hotspot_index_ttft_p90": 2.3493858974059214,
  "per_worker_latency_p90_s": {
    "http://127.0.0.1:8000": 30.185792533413043,
    "http://127.0.0.1:8001": 47.49661003401852,
    "http://127.0.0.1:8002": 22.069474861002554,
    "http://127.0.0.1:8003": 83.73774532350944,
    "http://127.0.0.1:8004": 22.03310715127737,
    "http://127.0.0.1:8005": 33.024566102202556,
    "http://127.0.0.1:8006": 61.65600914339302,
    "http://127.0.0.1:8007": 6.077459598158019
  },
  "per_worker_ttft_p90_s": {
    "http://127.0.0.1:8000": 12.284569517592924,
    "http://127.0.0.1:8001": 23.570226482005094,
    "http://127.0.0.1:8002": 5.202772857400123,
    "http://127.0.0.1:8003": 55.37555769548635,
    "http://127.0.0.1:8004": 17.031311958114394,
    "http://127.0.0.1:8005": 25.48531596700202,
    "http://127.0.0.1:8006": 36.31029207323453,
    "http://127.0.0.1:8007": 2.4984901855932535
  },
  "status": "supported"
}
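
The file does not say how hotspot_index_ttft_p90 is derived. One reading that appears consistent with the numbers above is the worst worker's TTFT p90 divided by the upper median across workers (55.38 s on port 8003 over 23.57 s on port 8001). The sketch below checks that reading. The function name hotspot_index and the max-over-upper-median definition are assumptions inferred from the data, not taken from the repository's code.

```python
import statistics

# Values copied from per_worker_ttft_p90_s in the JSON above (seconds).
per_worker_ttft_p90_s = {
    "http://127.0.0.1:8000": 12.284569517592924,
    "http://127.0.0.1:8001": 23.570226482005094,
    "http://127.0.0.1:8002": 5.202772857400123,
    "http://127.0.0.1:8003": 55.37555769548635,
    "http://127.0.0.1:8004": 17.031311958114394,
    "http://127.0.0.1:8005": 25.48531596700202,
    "http://127.0.0.1:8006": 36.31029207323453,
    "http://127.0.0.1:8007": 2.4984901855932535,
}

# Value recorded in the file under "hotspot_index_ttft_p90".
recorded_index = 2.3493858974059214


def hotspot_index(per_worker: dict[str, float]) -> float:
    """Assumed definition: max per-worker p90 over the upper median.

    statistics.median_high returns the larger of the two middle values
    when the number of workers is even, so no interpolation is involved.
    """
    values = list(per_worker.values())
    return max(values) / statistics.median_high(values)


index = hotspot_index(per_worker_ttft_p90_s)
hottest = max(per_worker_ttft_p90_s, key=per_worker_ttft_p90_s.get)
coolest = min(per_worker_ttft_p90_s, key=per_worker_ttft_p90_s.get)

print(f"assumed hotspot index: {index:.6f} (recorded: {recorded_index:.6f})")
print(f"hottest worker: {hottest}, coolest worker: {coolest}")
```

If the repository defines the index differently (for example with an interpolated median or a mean in the denominator), the computed value would diverge from the recorded one, so the comparison doubles as a quick check on the assumption.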