Gahow Wang fafc44da79 MB5 PD reuse-centric ablation: tooling, data, Fig 1-3
Three-axis controlled ablation of PD-colo vs PD-disagg on synthetic regular
traces (closed-loop, controlled reuse via REPLAY_NO_REALIZED_PREFIX) on the
clean stack (e13391e gated off).

  Axis 1 (Fig 1) -- reuse 6%->94% at N=8, in8192/out256
  Axis 2 (Fig 2) -- shape in2048/out2048 -> in32768/out64 at N=8, reuse~70%
  Axis 3 (Fig 3) -- concurrency N=8/16/32/64 at reuse~71%, in8192/out256

Findings:
  * APC parity colo=PD at every reuse (5.5/22/44/66/77/82%) -- contamination
    fix validated.
  * PD edge erodes 1.57x->1.10x with reuse; prefill GPUs strand 26%->9%.
  * Shape: PD-best peaks mid-sweep (1.34x at in8192/out512); the wrong PD
    ratio is catastrophic at the prefill-heavy extreme (in32768/out64 pd2:
    378/400 requests finished, e2e p99 432s).
  * Concurrency: PD wins N<=32 (1.23-1.29x), tips at N=64 -- pd2/pd4
    crater (APC 71%->1.4%, TPS -30%) while colo scales cleanly.
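The "PD-best" ratios above can be recomputed from the per-run rows. The sketch below is a minimal example, not the analysis code from this commit: it assumes the ratio is colo divided by the best PD arm on `e2e_p90` (the commit does not name the metric), and `pd_edge` is a hypothetical helper. The four rows are copied from the in8192/out512 entries of fig2.json.

```python
import re
from collections import defaultdict

def pd_edge(rows, metric="e2e_p90"):
    """Return {shape: (best_pd_arm, colo_metric / best_pd_metric)}.

    Lower latency is better, so the best PD arm is the one with the
    smallest metric value. The metric choice is an assumption.
    """
    by_shape = defaultdict(dict)
    for row in rows:
        m = re.search(r"in\d+_out\d+", row["name"])
        if m is None:
            continue  # row name does not encode a shape
        by_shape[m.group(0)][row["arm"]] = row[metric]

    edges = {}
    for shape, arms in by_shape.items():
        pd_arms = {arm: v for arm, v in arms.items() if arm != "colo"}
        if "colo" not in arms or not pd_arms:
            continue  # need both a baseline and at least one PD arm
        best = min(pd_arms, key=pd_arms.get)
        edges[shape] = (best, arms["colo"] / pd_arms[best])
    return edges

# Values copied from the fig2 rows for in8192/out512.
rows = [
    {"name": "fig2_in8192_out512_colo_8C-proxy_rep1", "arm": "colo",
     "e2e_p90": 4.748414856137243},
    {"name": "fig2_in8192_out512_pd2_2P+6D_rep1", "arm": "2P+6D",
     "e2e_p90": 3.546729282848538},
    {"name": "fig2_in8192_out512_pd4_4P+4D_rep1", "arm": "4P+4D",
     "e2e_p90": 3.8297921021352526},
    {"name": "fig2_in8192_out512_pd6_6P+2D_rep1", "arm": "6P+2D",
     "e2e_p90": 4.981798351998441},
]

edges = pd_edge(rows)
best_arm, ratio = edges["in8192_out512"]
print(best_arm, round(ratio, 2))
```

On these rows the best arm is 2P+6D and the ratio lands near the 1.34x quoted for in8192/out512, which suggests (but does not prove) that `e2e_p90` is the metric behind that figure.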

Infrastructure:
  * replayer: --max-inflight-sessions, --inter-turn-think, --no-realized-prefix
    (env-defaulted via REPLAY_MAX_INFLIGHT, REPLAY_INTER_TURN_THINK_S,
    REPLAY_NO_REALIZED_PREFIX).
  * mb5_run.sh: writes bench_config.json + gpu_util.csv + run_window.json +
    instance_apc.txt + metrics.jsonl for bench_report/fig_agg ingest.
  * fig_agg.py: per-arm GPU role split + producer-side APC; --json mode.
  * gpu_util_report.py: companion per-GPU util report from gpu_util.csv.
  * partial_summary.py: stats from in-flight replay_metrics.jsonl
    (works before metrics.summary.json exists).
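One common way to wire "env-defaulted" flags like the replayer's is to read the environment variable when the parser is built and let an explicit flag override it. The sketch below only illustrates that pattern and is not the replayer's actual code: the flag and variable names come from the list above, but the types (int, float seconds, boolean), the fallback defaults, and the `build_parser` helper are assumptions.

```python
import argparse
import os

def _truthy(raw):
    """Interpret common truthy spellings; anything else counts as False."""
    return raw is not None and raw.strip().lower() in {"1", "true", "yes", "on"}

def build_parser(env=None):
    """Build a parser whose defaults come from env (os.environ if omitted)."""
    env = os.environ if env is None else env
    inflight = env.get("REPLAY_MAX_INFLIGHT")
    think = env.get("REPLAY_INTER_TURN_THINK_S")

    p = argparse.ArgumentParser(description="illustrative replayer flags")
    # Assumed type: integer cap on concurrent sessions, unset means no cap.
    p.add_argument("--max-inflight-sessions", type=int,
                   default=int(inflight) if inflight else None)
    # Assumed type: think time in seconds, as the _S suffix suggests.
    p.add_argument("--inter-turn-think", type=float,
                   default=float(think) if think else 0.0)
    # Assumed type: boolean switch; the env var turns it on by default.
    p.add_argument("--no-realized-prefix", action="store_true",
                   default=_truthy(env.get("REPLAY_NO_REALIZED_PREFIX")))
    return p

# A dict stands in for the process environment so the example is repeatable.
env = {"REPLAY_MAX_INFLIGHT": "8", "REPLAY_NO_REALIZED_PREFIX": "1"}
from_env = build_parser(env).parse_args([])
overridden = build_parser(env).parse_args(["--max-inflight-sessions", "16"])
print(from_env.max_inflight_sessions, overridden.max_inflight_sessions)
```

Passing the environment as a plain mapping keeps the precedence (flag over environment over built-in default) easy to test without touching the real process environment.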

Data: analysis/mb5_pd_ablation/fig{1,2,3}.json (24 + 20 + 16 rows).
Figures: figs/mb5_pd_ablation/fig{1_reuse,2_shape,3_concurrency}_axis.png.
2026-05-31 20:14:46 +08:00


[{"name": "fig2_in16384_out128_colo_8C-proxy_rep1", "arm": "colo", "n": 400, "req": 400, "e2e_p50": 1.6966288615367375, "e2e_p90": 3.142477283347398, "e2e_p99": 4.572902428222587, "e2e_mean": 1.8778391962422756, "ttft_p90": 1.528641331603285, "tpot_p99": 0.02700975849941244, "tps": 293.2414474758892, "wall": 174.600147560006, "pu": 30.718373493975903, "du": null, "apc": 0.73828125}, {"name": "fig2_in16384_out128_pd2_2P+6D_rep1", "arm": "2P+6D", "n": 400, "req": 400, "e2e_p50": 1.5746862735250033, "e2e_p90": 3.6393908081925486, "e2e_p99": 6.788023261578052, "e2e_mean": 2.1054475268305395, "ttft_p90": 2.8525443844730045, "tpot_p99": 0.007377313970786145, "tps": 272.743216323279, "wall": 187.72235911199823, "pu": 54.79545454545455, "du": 28.009469696969695, "apc": 0.73828125}, {"name": "fig2_in16384_out128_pd4_4P+4D_rep1", "arm": "4P+4D", "n": 400, "req": 400, "e2e_p50": 1.2106705509941094, "e2e_p90": 2.6971542384708305, "e2e_p99": 4.516567796494346, "e2e_mean": 1.6196880877471995, "ttft_p90": 1.8512291587772782, "tpot_p99": 0.007638815456312003, "tps": 307.7022111731225, "wall": 166.3946443699533, "pu": 28.876582278481013, "du": 47.36708860759494, "apc": 0.73828125}, {"name": "fig2_in16384_out128_pd6_6P+2D_rep1", "arm": "6P+2D", "n": 400, "req": 400, "e2e_p50": 1.3666948495083489, "e2e_p90": 2.656380763812923, "e2e_p99": 4.434802388340466, "e2e_mean": 1.6502306728763505, "ttft_p90": 1.7600484249996953, "tpot_p99": 0.009977159781425488, "tps": 307.56190002160906, "wall": 166.47055437101517, "pu": 21.023206751054854, "du": 70.51898734177215, "apc": 0.73828125}, {"name": "fig2_in2048_out2048_colo_8C-proxy_rep1", "arm": "colo", "n": 400, "req": 400, "e2e_p50": 11.900513574946672, "e2e_p90": 14.623661132121924, "e2e_p99": 17.82160759311984, "e2e_mean": 12.263538628305833, "ttft_p90": 0.13757785173365847, "tpot_p99": 0.00867108589104906, "tps": 1109.2196116287032, "wall": 738.5372485410189, "pu": 54.30869565217391, "du": null, "apc": 0.65625}, {"name": 
"fig2_in2048_out2048_pd2_2P+6D_rep1", "arm": "2P+6D", "n": 400, "req": 400, "e2e_p50": 11.239029973454308, "e2e_p90": 12.24954682419775, "e2e_p99": 12.908233385497004, "e2e_mean": 11.36166481389053, "ttft_p90": 0.1597270941361785, "tpot_p99": 0.006243306631126823, "tps": 1159.3604844966112, "wall": 706.5964477439411, "pu": 1.9437689969604863, "du": 86.7517730496454, "apc": 0.65625}, {"name": "fig2_in2048_out2048_pd4_4P+4D_rep1", "arm": "4P+4D", "n": 400, "req": 400, "e2e_p50": 12.676327208988369, "e2e_p90": 13.124083981337025, "e2e_p99": 13.789963249830762, "e2e_mean": 12.521095666602777, "ttft_p90": 0.1668232314521447, "tpot_p99": 0.006606968528777976, "tps": 1070.1894910008175, "wall": 765.4719158509979, "pu": 0.5945378151260504, "du": 92.65546218487395, "apc": 0.65625}, {"name": "fig2_in2048_out2048_pd6_6P+2D_rep1", "arm": "6P+2D", "n": 400, "req": 400, "e2e_p50": 15.628125407500193, "e2e_p90": 16.762494630913714, "e2e_p99": 17.865684803246978, "e2e_mean": 15.437463862727746, "ttft_p90": 0.1816938084899448, "tpot_p99": 0.008672833048181654, "tps": 897.4033352505149, "wall": 912.8559788239654, "pu": 0.2651869158878505, "du": 98.21028037383178, "apc": 0.65625}, {"name": "fig2_in32768_out64_colo_8C-proxy_rep1", "arm": "colo", "n": 400, "req": 400, "e2e_p50": 2.8777761650271714, "e2e_p90": 7.02909248394426, "e2e_p99": 12.042338756883982, "e2e_mean": 3.6056005006073972, "ttft_p90": 4.589756254199893, "tpot_p99": 0.15461345151164715, "tps": 97.4559162735194, "wall": 262.6828722039936, "pu": 36.19410569105691, "du": null, "apc": 0.73828125}, {"name": "fig2_in32768_out64_pd2_2P+6D_rep1", "arm": "2P+6D", "n": 378, "req": 400, "e2e_p50": 5.744399158516899, "e2e_p90": 17.501065154711252, "e2e_p99": 431.9109102533118, "e2e_mean": 24.76107206362763, "ttft_p90": 17.079777074372398, "tpot_p99": 0.008512455084701146, "tps": 17.72103702655267, "wall": 1365.1571273030713, "pu": 22.84921875, "du": 2.06796875, "apc": 0.8334464289939819}, {"name": 
"fig2_in32768_out64_pd4_4P+4D_rep1", "arm": "4P+4D", "n": 400, "req": 400, "e2e_p50": 2.331694360531401, "e2e_p90": 8.168041506421288, "e2e_p99": 16.819581468357928, "e2e_mean": 4.067478344673291, "ttft_p90": 7.7613852798473095, "tpot_p99": 0.008991237692276223, "tps": 89.86030789358054, "wall": 284.8866268109996, "pu": 53.20335820895522, "du": 15.065298507462687, "apc": 0.73828125}, {"name": "fig2_in32768_out64_pd6_6P+2D_rep1", "arm": "6P+2D", "n": 400, "req": 400, "e2e_p50": 1.881187686463818, "e2e_p90": 6.823026831133758, "e2e_p99": 12.242816790416828, "e2e_mean": 3.2513622655556538, "ttft_p90": 6.349652938055806, "tpot_p99": 0.011577233054050565, "tps": 105.74545516262978, "wall": 242.09078263107222, "pu": 42.801169590643276, "du": 31.153508771929825, "apc": 0.73828125}, {"name": "fig2_in4096_out1024_colo_8C-proxy_rep1", "arm": "colo", "n": 400, "req": 400, "e2e_p50": 6.376699871034361, "e2e_p90": 8.016901113302447, "e2e_p99": 9.421493258888365, "e2e_mean": 6.472622742803069, "ttft_p90": 0.26107478952035307, "tpot_p99": 0.009009339244909244, "tps": 964.4334957573764, "wall": 424.7052822220139, "pu": 50.12248743718593, "du": null, "apc": 0.65625}, {"name": "fig2_in4096_out1024_pd2_2P+6D_rep1", "arm": "2P+6D", "n": 400, "req": 400, "e2e_p50": 5.711871185048949, "e2e_p90": 6.152766603662167, "e2e_p99": 6.618846287685439, "e2e_mean": 5.7896922694568635, "ttft_p90": 0.2993865112075582, "tpot_p99": 0.006226416155723225, "tps": 1026.860822805463, "wall": 398.88560445897747, "pu": 3.8877005347593583, "du": 83.79411764705883, "apc": 0.65625}, {"name": "fig2_in4096_out1024_pd4_4P+4D_rep1", "arm": "4P+4D", "n": 400, "req": 400, "e2e_p50": 6.441164412011858, "e2e_p90": 6.755943879298865, "e2e_p99": 7.246829778881511, "e2e_mean": 6.4186325767840025, "ttft_p90": 0.30361198947066437, "tpot_p99": 0.0066874305859860395, "tps": 948.5732059117771, "wall": 431.8064198390348, "pu": 2.6683168316831685, "du": 88.01608910891089, "apc": 0.65625}, {"name": 
"fig2_in4096_out1024_pd6_6P+2D_rep1", "arm": "6P+2D", "n": 400, "req": 400, "e2e_p50": 8.175307728059124, "e2e_p90": 8.772436089895201, "e2e_p99": 9.845743471009191, "e2e_mean": 8.103695073690615, "ttft_p90": 0.3135268738726154, "tpot_p99": 0.008783244535960586, "tps": 795.3463509805472, "wall": 514.9957619030029, "pu": 1.2988980716253444, "du": 95.7107438016529, "apc": 0.65625}, {"name": "fig2_in8192_out512_colo_8C-proxy_rep1", "arm": "colo", "n": 400, "req": 400, "e2e_p50": 3.569815175491385, "e2e_p90": 4.748414856137243, "e2e_p99": 6.3728869484120505, "e2e_mean": 3.6905484462657476, "ttft_p90": 0.5787142073037103, "tpot_p99": 0.011623186658178922, "tps": 749.35951206451, "wall": 273.3000605220441, "pu": 43.21484375, "du": null, "apc": 0.7109375}, {"name": "fig2_in8192_out512_pd2_2P+6D_rep1", "arm": "2P+6D", "n": 400, "req": 400, "e2e_p50": 3.0584998495178297, "e2e_p90": 3.546729282848538, "e2e_p99": 4.885626904441742, "e2e_mean": 3.183082153094583, "ttft_p90": 0.6684098902973354, "tpot_p99": 0.006093405278323496, "tps": 801.0277344160907, "wall": 255.67154693999328, "pu": 14.795833333333333, "du": 70.95138888888889, "apc": 0.7109375}, {"name": "fig2_in8192_out512_pd4_4P+4D_rep1", "arm": "4P+4D", "n": 400, "req": 400, "e2e_p50": 3.3473425395786762, "e2e_p90": 3.8297921021352526, "e2e_p99": 4.728309926969231, "e2e_mean": 3.4304884171887533, "ttft_p90": 0.647590011463035, "tpot_p99": 0.0067240075080280985, "tps": 768.7152035389245, "wall": 266.41856315208133, "pu": 7.96, "du": 83.674, "apc": 0.7109375}, {"name": "fig2_in8192_out512_pd6_6P+2D_rep1", "arm": "6P+2D", "n": 400, "req": 400, "e2e_p50": 4.395662502502091, "e2e_p90": 4.981798351998441, "e2e_p99": 6.572449592349585, "e2e_mean": 4.434228266531718, "ttft_p90": 0.6629501176299528, "tpot_p99": 0.009418493171829412, "tps": 645.2526253784575, "wall": 317.3950665909797, "pu": 5.468680089485459, "du": 94.43959731543625, "apc": 0.7109375}]
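
Rows like these can be screened for anomalies before plotting. The sketch below flags runs that finished fewer requests than they issued and arms whose `apc` differs from the colo arm of the same shape. It is an illustrative check, not part of the committed tooling: `check_rows` and `shape_of` are hypothetical helpers, and it assumes `n` counts finished requests and `req` issued ones. The three sample rows are trimmed copies of the in32768/out64 entries.

```python
import re

def shape_of(name):
    """Extract the 'in<N>_out<M>' shape token from a run name, or None."""
    m = re.search(r"in\d+_out\d+", name)
    return m.group(0) if m else None

def check_rows(rows, apc_tol=1e-6):
    """Return (run_name, problem) pairs for suspicious rows."""
    # Reference APC per shape, taken from the colo arm.
    colo_apc = {shape_of(r["name"]): r["apc"]
                for r in rows if r["arm"] == "colo"}
    problems = []
    for r in rows:
        if r["n"] < r["req"]:
            problems.append((r["name"], f"incomplete {r['n']}/{r['req']}"))
        ref = colo_apc.get(shape_of(r["name"]))
        if ref is not None and abs(r["apc"] - ref) > apc_tol:
            problems.append(
                (r["name"], f"apc {r['apc']:.3f} vs colo {ref:.3f}"))
    return problems

# Fields copied from the in32768/out64 rows; other fields omitted.
rows = [
    {"name": "fig2_in32768_out64_colo_8C-proxy_rep1", "arm": "colo",
     "n": 400, "req": 400, "apc": 0.73828125},
    {"name": "fig2_in32768_out64_pd2_2P+6D_rep1", "arm": "2P+6D",
     "n": 378, "req": 400, "apc": 0.8334464289939819},
    {"name": "fig2_in32768_out64_pd4_4P+4D_rep1", "arm": "4P+4D",
     "n": 400, "req": 400, "apc": 0.73828125},
]

problems = check_rows(rows)
for name, problem in problems:
    print(name, "->", problem)
```

To run it over the full file, load the list with `json.load` from analysis/mb5_pd_ablation/fig2.json and pass it to `check_rows`. Only the pd2 run at in32768/out64 is flagged in the sample, on both counts.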