Three-axis controlled ablation of PD-colo vs PD-disagg on synthetic regular
traces (closed-loop, controlled reuse via REPLAY_NO_REALIZED_PREFIX) on the
clean stack (e13391e gated off).
Axis 1 (Fig 1) -- reuse 6%->94% at N=8, in8192/out256
Axis 2 (Fig 2) -- shape in2048/out2048 -> in32768/out64 at N=8, reuse~70%
Axis 3 (Fig 3) -- concurrency N=8/16/32/64 at reuse~71%, in8192/out256
Findings:
* APC parity colo=PD at every reuse (5.5/22/44/66/77/82%) -- contamination
fix validated.
* PD edge erodes 1.57x->1.10x with reuse; prefill GPUs strand 26%->9%.
* Shape: PD-best peaks mid-sweep (1.34x at in8192/out512); wrong PD ratio
catastrophic at prefill extreme (in32768/out64 pd2 = 378/400, p99 432s).
* Concurrency: PD wins N<=32 (1.23-1.29x), tips at N=64 -- pd2/pd4
crater (APC 71%->1.4%, TPS -30%) while colo scales cleanly.
Infrastructure:
* replayer: --max-inflight-sessions, --inter-turn-think, --no-realized-prefix
(env-defaulted via REPLAY_MAX_INFLIGHT, REPLAY_INTER_TURN_THINK_S,
REPLAY_NO_REALIZED_PREFIX).
* mb5_run.sh: writes bench_config.json + gpu_util.csv + run_window.json +
instance_apc.txt + metrics.jsonl for bench_report/fig_agg ingest.
* fig_agg.py: per-arm GPU role split + producer-side APC; --json mode.
* gpu_util_report.py: companion per-GPU util report from gpu_util.csv.
* partial_summary.py: stats from in-flight replay_metrics.jsonl
(works before metrics.summary.json exists).
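A minimal sketch of how the env-defaulted replayer flags could be wired, assuming plain argparse. The flag and environment-variable names come from the list above; the value types, the fallback defaults (8 sessions, 0 s think time), and the truthy-string parsing are assumptions for illustration, not the replayer's actual behavior.

```python
import argparse
import os


def _env_flag(name: str) -> bool:
    """Assumed convention: 1/true/yes/on (case-insensitive) enables the flag."""
    return os.environ.get(name, "").strip().lower() in {"1", "true", "yes", "on"}


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose defaults are read from the environment at build time."""
    p = argparse.ArgumentParser(description="replayer flag sketch (hypothetical)")
    p.add_argument(
        "--max-inflight-sessions",
        type=int,
        # Fallback of 8 is an assumption, not taken from the replayer.
        default=int(os.environ.get("REPLAY_MAX_INFLIGHT", "8")),
    )
    p.add_argument(
        "--inter-turn-think",
        type=float,
        # Seconds of think time between turns; 0 fallback is an assumption.
        default=float(os.environ.get("REPLAY_INTER_TURN_THINK_S", "0")),
    )
    p.add_argument(
        "--no-realized-prefix",
        action="store_true",
        default=_env_flag("REPLAY_NO_REALIZED_PREFIX"),
    )
    return p


if __name__ == "__main__":
    print(build_parser().parse_args())
```

With this layout an explicit command-line value overrides the environment, and the environment overrides the built-in fallback.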
Data: analysis/mb5_pd_ablation/fig{1,2,3}.json (24 + 20 + 16 rows).
Figures: figs/mb5_pd_ablation/fig{1_reuse,2_shape,3_concurrency}_axis.png.
[{"name": "fig1_p2048_colo_8C-proxy_rep1", "arm": "colo", "n": 400, "req": 400, "e2e_p50": 2.555279068008531, "e2e_p90": 3.4275179531075994, "e2e_p99": 5.231042563370427, "e2e_mean": 2.533418247078953, "ttft_p90": 1.059612769272644, "tpot_p99": 0.016176813139529973, "tps": 488.9617809203525, "wall": 209.42332099506166, "pu": 35.58080808080808, "du": null, "apc": 0.21875}, {"name": "fig1_p2048_pd2_2P+6D_rep1", "arm": "2P+6D", "n": 400, "req": 400, "e2e_p50": 1.9831702220253646, "e2e_p90": 2.408317962428555, "e2e_p99": 3.386159433723659, "e2e_mean": 2.080101182157232, "ttft_p90": 1.0395420689717867, "tpot_p99": 0.005522104923522062, "tps": 530.468328273858, "wall": 193.0369723169133, "pu": 46.917582417582416, "du": 48.58058608058608, "apc": 0.21875}, {"name": "fig1_p2048_pd6_6P+2D_rep1", "arm": "6P+2D", "n": 400, "req": 400, "e2e_p50": 2.3523911059601232, "e2e_p90": 2.7588249894673935, "e2e_p99": 3.603395572100996, "e2e_mean": 2.4113844815955963, "ttft_p90": 0.7664745874935761, "tpot_p99": 0.009031482047424195, "tps": 488.72074894811755, "wall": 209.52660639106762, "pu": 14.218855218855218, "du": 90.16161616161617, "apc": 0.21875}, {"name": "fig1_p2048_pd_4P+4D_rep1", "arm": "4P+4D", "n": 400, "req": 400, "e2e_p50": 2.0429230239242315, "e2e_p90": 2.2362921022577216, "e2e_p99": 2.9135718233766945, "e2e_mean": 2.095764471256989, "ttft_p90": 0.7477957331226207, "tpot_p99": 0.006522443569635094, "tps": 527.6226407393885, "wall": 194.07810069806874, "pu": 23.669444444444444, "du": 65.01666666666667, "apc": 0.21875}, {"name": "fig1_p4096_colo_8C-proxy_rep1", "arm": "colo", "n": 400, "req": 400, "e2e_p50": 2.2734450550633483, "e2e_p90": 3.0487391501781533, "e2e_p99": 4.6287568241392725, "e2e_mean": 2.2661249774988392, "ttft_p90": 0.713115519611165, "tpot_p99": 0.014206131751343207, "tps": 520.0122707724117, "wall": 196.9184301130008, "pu": 34.659946236559136, "du": null, "apc": 0.4375}, {"name": "fig1_p4096_pd2_2P+6D_rep1", "arm": "2P+6D", "n": 400, "req": 400, "e2e_p50": 
1.8291561939986423, "e2e_p90": 2.2601341274916207, "e2e_p99": 3.2612337802827804, "e2e_mean": 1.9412393476464787, "ttft_p90": 0.8800801524659628, "tpot_p99": 0.005551423189517877, "tps": 552.1771045541858, "wall": 185.44774702796713, "pu": 38.293103448275865, "du": 49.05747126436781, "apc": 0.4375}, {"name": "fig1_p4096_pd6_6P+2D_rep1", "arm": "6P+2D", "n": 400, "req": 400, "e2e_p50": 2.2462158624548465, "e2e_p90": 2.64611944751814, "e2e_p99": 3.4558432800625423, "e2e_mean": 2.2963840230583448, "ttft_p90": 0.711689621617552, "tpot_p99": 0.008991341477657172, "tps": 502.68863490365385, "wall": 203.70462526893243, "pu": 10.604166666666666, "du": 88.75520833333333, "apc": 0.4375}, {"name": "fig1_p4096_pd_4P+4D_rep1", "arm": "4P+4D", "n": 400, "req": 400, "e2e_p50": 1.8967114535043947, "e2e_p90": 2.18965909028193, "e2e_p99": 2.9952263131842467, "e2e_mean": 1.9644193770724814, "ttft_p90": 0.7082089037983679, "tpot_p99": 0.006562968838594706, "tps": 548.1341659016695, "wall": 186.81557613098994, "pu": 19.133522727272727, "du": 67.32102272727273, "apc": 0.4375}, {"name": "fig1_p512_colo_8C-proxy_rep1", "arm": "colo", "n": 400, "req": 400, "e2e_p50": 2.6645242294762284, "e2e_p90": 3.61182137549622, "e2e_p99": 5.3448568455432515, "e2e_mean": 2.6195515102380886, "ttft_p90": 1.112424884561915, "tpot_p99": 0.01658212741880276, "tps": 475.13983834945145, "wall": 215.5155003539985, "pu": 36.375, "du": null, "apc": 0.0546875}, {"name": "fig1_p512_pd2_2P+6D_rep1", "arm": "2P+6D", "n": 400, "req": 400, "e2e_p50": 2.103162212530151, "e2e_p90": 2.4044417485478347, "e2e_p99": 3.3844505867047685, "e2e_mean": 2.1739998702009324, "ttft_p90": 1.0342383790179164, "tpot_p99": 0.00550937183462905, "tps": 517.9536953932844, "wall": 197.7010704060085, "pu": 55.854838709677416, "du": 46.65232974910394, "apc": 0.0546875}, {"name": "fig1_p512_pd6_6P+2D_rep1", "arm": "6P+2D", "n": 400, "req": 400, "e2e_p50": 2.3785464115208015, "e2e_p90": 2.7350583746214396, "e2e_p99": 3.4022513560648044, 
"e2e_mean": 2.445902353489655, "ttft_p90": 0.7912747941561975, "tpot_p99": 0.009167485321045615, "tps": 482.02537525954966, "wall": 212.436948874034, "pu": 17.535353535353536, "du": 91.0959595959596, "apc": 0.0546875}, {"name": "fig1_p512_pd_4P+4D_rep1", "arm": "4P+4D", "n": 400, "req": 400, "e2e_p50": 2.153953853994608, "e2e_p90": 2.2966971063637174, "e2e_p99": 3.044859012498054, "e2e_mean": 2.1845501415853503, "ttft_p90": 0.7954113337909803, "tpot_p99": 0.006337697780334992, "tps": 512.0369567409949, "wall": 199.9855648149969, "pu": 26.21276595744681, "du": 62.944148936170215, "apc": 0.0546875}, {"name": "fig1_p6144_colo_8C-proxy_rep1", "arm": "colo", "n": 400, "req": 400, "e2e_p50": 2.0121258909930475, "e2e_p90": 2.7345886924420504, "e2e_p99": 3.9082167004665824, "e2e_mean": 2.0664054725714958, "ttft_p90": 0.5843230376602151, "tpot_p99": 0.01391471299074371, "tps": 547.3893798822313, "wall": 187.0697601441061, "pu": 34.44744318181818, "du": null, "apc": 0.65625}, {"name": "fig1_p6144_pd2_2P+6D_rep1", "arm": "2P+6D", "n": 400, "req": 400, "e2e_p50": 1.6655802099849097, "e2e_p90": 2.1824162190663636, "e2e_p99": 3.4232188416458658, "e2e_mean": 1.811165077216283, "ttft_p90": 0.7816491658217275, "tpot_p99": 0.005717973897763179, "tps": 574.0385687244718, "wall": 178.38522632292006, "pu": 21.53012048192771, "du": 51.61244979919679, "apc": 0.65625}, {"name": "fig1_p6144_pd6_6P+2D_rep1", "arm": "6P+2D", "n": 400, "req": 400, "e2e_p50": 2.104133589542471, "e2e_p90": 2.6115266072680243, "e2e_p99": 3.388375885724087, "e2e_mean": 2.1801070072923903, "ttft_p90": 0.7013537635677495, "tpot_p99": 0.00888146581120022, "tps": 521.6799713459584, "wall": 196.2889235249022, "pu": 6.709677419354839, "du": 88.70430107526882, "apc": 0.65625}, {"name": "fig1_p6144_pd_4P+4D_rep1", "arm": "4P+4D", "n": 399, "req": 400, "e2e_p50": 1.748427166021429, "e2e_p90": 2.1873498664470388, "e2e_p99": 3.1015963148581767, "e2e_mean": 1.8389387578233134, "ttft_p90": 0.6869595416123048, "tpot_p99": 
0.006769578668425845, "tps": 569.2010105657868, "wall": 179.45154366199858, "pu": 13.089285714285714, "du": 68.10119047619048, "apc": 0.65625}, {"name": "fig1_p7168_colo_8C-proxy_rep1", "arm": "colo", "n": 400, "req": 400, "e2e_p50": 1.9153751800186, "e2e_p90": 2.6549224384711154, "e2e_p99": 3.861135128394235, "e2e_mean": 1.958507129738573, "ttft_p90": 0.5802406982053072, "tpot_p99": 0.013510460370503293, "tps": 564.6374955198476, "wall": 181.35529576498084, "pu": 33.595588235294116, "du": null, "apc": 0.765625}, {"name": "fig1_p7168_pd2_2P+6D_rep1", "arm": "2P+6D", "n": 400, "req": 400, "e2e_p50": 1.5798494375194423, "e2e_p90": 2.1327995942323468, "e2e_p99": 3.3373043064947687, "e2e_mean": 1.7140900615448482, "ttft_p90": 0.727025477041025, "tpot_p99": 0.005695829117182167, "tps": 593.7272444367894, "wall": 172.46976782602724, "pu": 18.30246913580247, "du": 51.51234567901235, "apc": 0.765625}, {"name": "fig1_p7168_pd6_6P+2D_rep1", "arm": "6P+2D", "n": 400, "req": 400, "e2e_p50": 2.1042530980193987, "e2e_p90": 2.584451850724873, "e2e_p99": 3.5201327085494967, "e2e_mean": 2.164814479558263, "ttft_p90": 0.6972717452561484, "tpot_p99": 0.009263812077688232, "tps": 527.1747330056295, "wall": 194.24299684504513, "pu": 5.574275362318841, "du": 86.96195652173913, "apc": 0.765625}, {"name": "fig1_p7168_pd_4P+4D_rep1", "arm": "4P+4D", "n": 400, "req": 400, "e2e_p50": 1.621718394511845, "e2e_p90": 2.1209186871652492, "e2e_p99": 2.9950801801832854, "e2e_mean": 1.7384027546702419, "ttft_p90": 0.681954695621971, "tpot_p99": 0.006750334063133991, "tps": 587.6633213260067, "wall": 174.24943208799232, "pu": 9.521341463414634, "du": 70.10060975609755, "apc": 0.765625}, {"name": "fig1_p7680_colo_8C-proxy_rep1", "arm": "colo", "n": 400, "req": 400, "e2e_p50": 1.805449907493312, "e2e_p90": 2.2804414638434545, "e2e_p99": 3.2436008435313126, "e2e_mean": 1.8311505301928264, "ttft_p90": 0.5787636383553035, "tpot_p99": 0.01186512264145045, "tps": 586.5851277673049, "wall": 
174.5697174249217, "pu": 33.792682926829265, "du": null, "apc": 0.8203125}, {"name": "fig1_p7680_pd2_2P+6D_rep1", "arm": "2P+6D", "n": 400, "req": 400, "e2e_p50": 1.5353662585257553, "e2e_p90": 2.0703843546216376, "e2e_p99": 3.239132529124615, "e2e_mean": 1.6632775271314313, "ttft_p90": 0.684448261326179, "tpot_p99": 0.005772996509146383, "tps": 601.1715360533158, "wall": 170.3340791419614, "pu": 13.35625, "du": 52.83125, "apc": 0.8203125}, {"name": "fig1_p7680_pd6_6P+2D_rep1", "arm": "6P+2D", "n": 400, "req": 400, "e2e_p50": 1.935218213009648, "e2e_p90": 2.540124618355185, "e2e_p99": 3.5381310180295236, "e2e_mean": 2.0565583147658617, "ttft_p90": 0.6883091802941635, "tpot_p99": 0.009528811932669258, "tps": 539.8052979764931, "wall": 189.698027017992, "pu": 5.219101123595506, "du": 90.43820224719101, "apc": 0.8203125}, {"name": "fig1_p7680_pd_4P+4D_rep1", "arm": "4P+4D", "n": 400, "req": 400, "e2e_p50": 1.6052846949896775, "e2e_p90": 2.16270094960928, "e2e_p99": 3.0139038068498483, "e2e_mean": 1.7113545338093537, "ttft_p90": 0.6793354631052353, "tpot_p99": 0.006723990285321704, "tps": 594.143824556681, "wall": 172.34884175797924, "pu": 8.765432098765432, "du": 69.9320987654321, "apc": 0.8203125}]
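The reuse-axis edge can be re-derived from rows shaped like the ones above. The commit text does not say which metric the "PD edge" uses; this sketch assumes it is colo e2e_p90 divided by the best (lowest) e2e_p90 among the PD arms at the same prefix length, which reproduces the 1.57x and 1.10x endpoints on the p512 and p7680 rows. The helper name `pd_edge` is hypothetical, and the sample rows carry only the fields needed, with values rounded from the data.

```python
import re
from collections import defaultdict


def pd_edge(rows, metric="e2e_p90"):
    """Return {prefix_len: colo_metric / best_pd_metric} for fig1-style rows."""
    by_prefix = defaultdict(dict)
    for row in rows:
        # Row names look like "fig1_p512_colo_8C-proxy_rep1".
        m = re.match(r"fig1_p(\d+)_", row["name"])
        if m is None:
            continue
        by_prefix[int(m.group(1))][row["arm"]] = row[metric]

    edges = {}
    for prefix, arms in sorted(by_prefix.items()):
        pd_vals = [v for arm, v in arms.items() if arm != "colo"]
        if "colo" in arms and pd_vals:
            # Lower latency is better, so the best PD arm is the minimum.
            edges[prefix] = arms["colo"] / min(pd_vals)
    return edges


# Values rounded from the p512 and p7680 rows of the data above.
rows = [
    {"name": "fig1_p512_colo_8C-proxy_rep1", "arm": "colo", "e2e_p90": 3.6118},
    {"name": "fig1_p512_pd2_2P+6D_rep1", "arm": "2P+6D", "e2e_p90": 2.4044},
    {"name": "fig1_p512_pd6_6P+2D_rep1", "arm": "6P+2D", "e2e_p90": 2.7351},
    {"name": "fig1_p512_pd_4P+4D_rep1", "arm": "4P+4D", "e2e_p90": 2.2967},
    {"name": "fig1_p7680_colo_8C-proxy_rep1", "arm": "colo", "e2e_p90": 2.2804},
    {"name": "fig1_p7680_pd2_2P+6D_rep1", "arm": "2P+6D", "e2e_p90": 2.0704},
    {"name": "fig1_p7680_pd6_6P+2D_rep1", "arm": "6P+2D", "e2e_p90": 2.5401},
    {"name": "fig1_p7680_pd_4P+4D_rep1", "arm": "4P+4D", "e2e_p90": 2.1627},
]

for prefix, edge in pd_edge(rows).items():
    print(f"p{prefix}: {edge:.2f}x")
# prints:
# p512: 1.57x
# p7680: 1.10x
```

To run it on the full data, load the row list with `json.load` from analysis/mb5_pd_ablation/fig1.json and pass it to `pd_edge` unchanged.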