Gahow Wang a66f24d242 MB5 aggregate: cross-config KV-pool + latency comparison
Reads sweep root + tag, for each (config, rep):
- merges per-PID snapshots into cluster-wide KV timeline (carry-forward
  for PIDs without a sample in the bin)
- computes peak (max) and steady-state (10-90% median) pool utilization
- pulls latency p50/p90/p99 from replay_metrics.summary.json

Produces 4 outputs in --out-dir:
- mb5_kv_timeline.png    — N-panel cluster KV % over time, one panel per
                            config, faint per-rep lines + bold median
- mb5_peak_utilization.png — bar chart (peak vs steady) with ±std error bars
- mb5_latency_compare.png  — bar chart p50/p90/p99 e2e latency per config
- mb5_summary.csv          — flat per-(config, rep) table for the writeup

Validated on 4P+4D × 20-req smoke:
  4P+4D rep1: peak=12.8% steady=10.7% peak_wait=1
  p50=1.3s p90=10.5s p99=17.1s (vs. <1s for 8C — expected gap).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-28 00:49:21 +08:00
Description
No description provided
48 MiB
Languages
Python 82.9%
Shell 17.1%