The headline figure f6_e2e_latency_bars shows only p90, which hides
three other regimes:
- mean: unified dominates (3.3s TTFT, 7.0s E2E vs sticky 5.6s / 12.1s)
- p50: sticky and unified are tied on first-turn TTFT (0.5s each),
because sticky's first turn of each session is free; queues only
accumulate afterwards. Unified beats sticky everywhere else.
- p99: the tail shows unified's largest lead:
TTFT 42.3s vs sticky 74.1s; E2E 68.8s vs sticky 139.7s.
The 12-panel figure is the honest full picture; the 3-panel headline
stays as a slide-friendly summary.
- analysis/characterization/window_1_results/raw_stats/{policy}.json:
cached ttft/tpot/e2e {mean,p50,p90,p99} pulled from dash0
/home/admin/cpfs/wjh/agentic-kv/outputs/b3_sweep_20260525_095043/
(b3_policy_comparison.json doesn't record mean, only percentiles).
- analysis/characterization/render_window1_figures.py:
new fig_b3_latency_full_grid renders the 4×3 grid from the cache.
- figs/f6_e2e_latency_full_grid.png: 12-panel companion.
- PAPER_OUTLINE.md §5.2: both figures embedded; main table column
renamed from "Hotspot idx" to "Worker p90 (median / max)" to match
the new metric convention.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
24 lines
489 B
JSON
{
  "ttft": {
    "count": 1214.0,
    "mean": 5.55315460854824,
    "p50": 0.5415176274836995,
    "p90": 18.021296651283045,
    "p99": 74.09429564891524
  },
  "tpot": {
    "count": 1214.0,
    "mean": 0.027834537397398284,
    "p50": 0.008952101894096181,
    "p90": 0.03641285916619554,
    "p99": 0.35152006935195085
  },
  "e2e": {
    "count": 1214.0,
    "mean": 12.109200157184377,
    "p50": 2.081947358994512,
    "p90": 34.62592205510591,
    "p99": 139.68334607904353
  }
}
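The cached values in this file match the sticky figures quoted in the commit message (5.6s / 12.1s mean, 74.1s / 139.7s p99). As a rough cross-check, the sketch below rounds the cached TTFT and E2E stats to 0.1s and divides them by the unified figures. It is only an illustration: the unified numbers are the already-rounded values from the commit message, not read from a raw_stats file, and the helper names `summarize` and `slowdown` are hypothetical, not part of render_window1_figures.py.

```python
import json

# Sticky-policy stats copied from the JSON above (tpot omitted for brevity).
STICKY_JSON = """
{
  "ttft": {"count": 1214.0, "mean": 5.55315460854824, "p50": 0.5415176274836995,
           "p90": 18.021296651283045, "p99": 74.09429564891524},
  "e2e": {"count": 1214.0, "mean": 12.109200157184377, "p50": 2.081947358994512,
          "p90": 34.62592205510591, "p99": 139.68334607904353}
}
"""

# Assumption: unified-policy figures as quoted (already rounded) in the
# commit message; the unified raw_stats file itself is not shown here.
UNIFIED_QUOTED = {
    "ttft": {"mean": 3.3, "p99": 42.3},
    "e2e": {"mean": 7.0, "p99": 68.8},
}


def summarize(stats, metrics=("ttft", "e2e"), keys=("mean", "p50", "p90", "p99")):
    """Round each cached statistic to 0.1 s, the precision used in the message."""
    return {m: {k: round(stats[m][k], 1) for k in keys} for m in metrics}


def slowdown(sticky, unified):
    """Ratio sticky / unified for every statistic present in `unified`."""
    return {
        m: {k: round(sticky[m][k] / v, 2) for k, v in per_metric.items()}
        for m, per_metric in unified.items()
    }


sticky = summarize(json.loads(STICKY_JSON))
ratios = slowdown(sticky, UNIFIED_QUOTED)
print(sticky)
print(ratios)
```

With these rounded inputs the sticky/unified ratio is about 1.7 for both means and for p99 TTFT, and about 2.0 for p99 E2E, which is consistent with the p99 tail being where unified leads by the most.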