agentic-kvc

Files

Gahow Wang 0b180c191e v2 exp(d): expand figure to 6 panels (TTFT/E2E mean+p90, TPS, per-worker GPU util)

Per request: TTFT mean+p90, E2E mean+p90, decode TPS (output goodput; total/
prefill TPS omitted as cache-miss-inflated), and per-worker GPU-util boxplots
(8 workers/arm, tracets vs thinktime) showing utilization level + balance.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-05-30 21:10:27 +08:00

exp_a_tier_latency.png

v2 exp(a): add remote KV-store (RDMA) tier

2026-05-30 12:48:37 +08:00

exp_b_capacity_knee.png

v2 exp(b): GPU KV-capacity APC/latency knee + writeup

2026-05-30 11:23:31 +08:00

exp_c_dispatch_ablation.png

Replayer think-time dispatch mode + benchmarking guidance

2026-05-30 16:28:36 +08:00

exp_d_policy_dispatch.png

v2 exp(d): expand figure to 6 panels (TTFT/E2E mean+p90, TPS, per-worker GPU util)

2026-05-30 21:10:27 +08:00