5.2 KiB
| line | run_id | label | tp | request_count | scale_label | scale_value | fixture | kv_blocks | frontier_completed | frontier_total | frontier_complete | vllm_completed | vllm_total | frontier_preemptions | vllm_preemptions | frontier_prefix_hit | vllm_prefix_hit | prefix_hit_delta | frontier_rps | vllm_rps | rps_ratio | frontier_total_tps | vllm_total_tps | total_tps_ratio | frontier_decode_tps | vllm_decode_tps | decode_tps_ratio | frontier_ttft_p50_s | vllm_ttft_p50_s | ttft_p50_ratio | frontier_ttft_p95_s | vllm_ttft_p95_s | ttft_p95_ratio | frontier_tpot_p50_s | vllm_tpot_p50_s | tpot_p50_ratio | frontier_tpot_p95_s | vllm_tpot_p95_s | tpot_p95_ratio | frontier_e2e_p50_s | vllm_e2e_p50_s | e2e_p50_ratio | frontier_e2e_p95_s | vllm_e2e_p95_s | e2e_p95_ratio | notes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | tp1_n100_scale1 | TP1 N100 raw | 1 | 100 | raw | 1 | coder_100 | 15281 | 96 | 100 | false | 100 | 100 | 0 | 8 | 0.2487845616 | 0.2510820686 | -0.002297507075 | 0.4048148795 | 0.6879880691 | 0.588403924 | 2348.908821 | 3832.320581 | 0.6129207541 | 347.7992338 | 567.4456795 | 0.6129207541 | 0.9087481136 | 4.503025495 | 0.201808343 | 12.76295815 | 29.06046906 | 0.4391862402 | 0.05688966428 | 0.06608134396 | 0.8609035603 | 0.1456880793 | 0.6211491471 | 0.2345460505 | 30.93928316 | 41.84076733 | 0.7394530534 | 119.6361376 | 97.36622969 | 1.228723121 | Frontier incomplete before lifecycle fix; included as TP1 100-request baseline. |
| 3 | tp1_n500_scale1 | TP1 N500 raw | 1 | 500 | raw | 1 | coder_500 | 15281 | 439 | 500 | false | 500 | 500 | 0 | 63 | 0.1192374692 | 0.3868498695 | -0.2676124002 | 0.660990472 | 0.8401719451 | 0.7867323776 | 4733.748762 | 5282.903731 | 0.896050544 | 656.2204998 | 732.3476384 | 0.896050544 | 136.7755789 | 185.6581683 | 0.7367064976 | 340.2371222 | 375.8950067 | 0.9051387119 | 0.05643274739 | 0.04975253624 | 1.134268756 | 0.08942839773 | 0.0918798539 | 0.9733188935 | 177.7998574 | 224.2697872 | 0.7927945162 | 397.29145 | 417.3562933 | 0.9519239469 | Frontier incomplete; useful as high-pressure stress signal. |
| 4 | tp1_n200_scale0667 | TP1 N200 scale 0.667 | 1 | 200 | 0.667 | 0.6666666667 | coder_200_ts0667 | 15281 | 176 | 200 | false | 200 | 200 | 0 | 26 | 0.170276008 | 0.2697549478 | -0.09947893984 | 0.5830903706 | 0.8236788215 | 0.7079098737 | 3913.437526 | 4864.778909 | 0.8044430383 | 593.287826 | 737.51378 | 0.8044430383 | 20.58014532 | 34.56323652 | 0.595434554 | 96.71793818 | 120.8039818 | 0.800618794 | 0.05837096651 | 0.05145431897 | 1.13442307 | 0.235894569 | 0.2534757496 | 0.9306395954 | 73.20731169 | 83.6219905 | 0.875455263 | 189.2402903 | 183.726977 | 1.030008186 | Dense-arrival run; Frontier incomplete before lifecycle fix. |
| 5 | tp1_n200_scale2 | TP1 N200 scale 2 | 1 | 200 | 2 | 2 | coder_200_ts2 | 15281 | 200 | 200 | true | 200 | 200 | 33 | 43 | 0.23134169 | 0.2697549478 | -0.03841325784 | 0.5936627655 | 0.8029813635 | 0.7393232178 | 3506.267279 | 4742.53641 | 0.7393232178 | 531.5597036 | 718.9814831 | 0.7393232178 | 9.595321274 | 9.216767096 | 1.041072338 | 77.50341053 | 69.21141595 | 1.119806747 | 0.05421362546 | 0.04970337519 | 1.09074334 | 0.06653162646 | 0.06863309532 | 0.9693811149 | 61.45769412 | 55.00248734 | 1.117362088 | 174.4840836 | 142.3375087 | 1.225847531 | After Frontier decode-preemption lifecycle fix. |
| 6 | tp1_n200_scale3 | TP1 N200 scale 3 | 1 | 200 | 3 | 3 | coder_200_ts3 | 15281 | 200 | 200 | true | 200 | 200 | 20 | 16 | 0.2176751278 | 0.2697549478 | -0.05207982007 | 0.5739781652 | 0.7802265504 | 0.735655772 | 3390.00688 | 4608.142843 | 0.735655772 | 513.9343094 | 698.607051 | 0.735655772 | 1.001474116 | 1.166151478 | 0.8587856162 | 45.9466567 | 32.25842447 | 1.424330464 | 0.05339333437 | 0.04616159714 | 1.156661331 | 0.06861254671 | 0.0713836296 | 0.9611804148 | 44.76058145 | 33.21267588 | 1.34769573 | 154.5483135 | 122.7887113 | 1.258652459 | After Frontier decode-preemption lifecycle fix. |
| 7 | tp2_n200_scale2 | TP2 N200 scale 2 | 2 | 200 | 2 | 2 | coder_200_ts2 | 69055 | 200 | 200 | true | 200 | 200 | 0 | 0 | 0.2697549478 | 0.2697549478 | 0 | 0.7756823572 | 1.277818683 | 0.607036325 | 4581.304111 | 7547.001591 | 0.607036325 | 694.5382258 | 1144.14607 | 0.607036325 | 0.2690959621 | 0.225119116 | 1.195349231 | 6.744624223 | 0.715071776 | 9.432094022 | 0.04295527658 | 0.03004499679 | 1.429698158 | 0.05288764732 | 0.04340382318 | 1.218502046 | 26.05122482 | 16.44861007 | 1.583794905 | 106.7591651 | 72.5347179 | 1.471835394 | Uses true-mixed TP2/TP4 attention profile. |
| 8 | tp2_n200_scale3 | TP2 N200 scale 3 | 2 | 200 | 3 | 3 | coder_200_ts3 | 69055 | 200 | 200 | true | 200 | 200 | 0 | 0 | 0.2697549478 | 0.2697549478 | 0 | 0.6877705321 | 1.088050278 | 0.6321128225 | 4062.082806 | 6426.199028 | 0.6321128225 | 615.8228567 | 974.2293382 | 0.6321128225 | 0.1341535495 | 0.153530943 | 0.8737883511 | 0.5741378218 | 0.6270455511 | 0.9156237864 | 0.03937896849 | 0.01905767256 | 2.06630523 | 0.04670767225 | 0.02799082097 | 1.668678182 | 21.78596494 | 9.956003374 | 2.188223941 | 101.5918393 | 53.98348621 | 1.881905864 | Uses true-mixed TP2/TP4 attention profile. |
| 9 | tp4_n200_scale2 | TP4 N200 scale 2 | 4 | 200 | 2 | 2 | coder_200_ts2 | 177077 | 200 | 200 | true | 200 | 200 | 0 | 0 | 0.2697549478 | 0.2697549478 | 0 | 0.8525337931 | 1.536203537 | 0.5549614829 | 5035.200987 | 9073.063884 | 0.5549614829 | 763.350233 | 1375.501285 | 0.5549614829 | 0.09755515041 | 0.1704972619 | 0.5721801589 | 0.3856872342 | 1.419861408 | 0.2716372401 | 0.03366585047 | 0.01634437735 | 2.059781767 | 0.03838265621 | 0.02831690026 | 1.355468143 | 18.65216282 | 9.260885488 | 2.014079846 | 84.93775414 | 43.62188903 | 1.947136083 | Uses true-mixed TP2/TP4 attention profile. |
| 10 | tp4_n200_scale3 | TP4 N200 scale 3 | 4 | 200 | 3 | 3 | coder_200_ts3 | 177077 | 200 | 200 | true | 200 | 200 | 0 | 0 | 0.2697549478 | 0.2697549478 | 0 | 0.7373665172 | 1.253504493 | 0.5882440162 | 4355.004629 | 7403.398096 | 0.5882440162 | 660.2306059 | 1122.375388 | 0.5882440162 | 0.08859749135 | 0.100106278 | 0.885034317 | 0.3458954617 | 0.3184188101 | 1.086290919 | 0.03106778109 | 0.009410284212 | 3.301471071 | 0.03578285082 | 0.01279276668 | 2.79711588 | 16.90291941 | 5.54948732 | 3.045852424 | 83.00995365 | 27.86907583 | 2.978568581 | Uses true-mixed TP2/TP4 attention profile. |
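
The derived columns in the table appear to follow a simple convention: each `*_ratio` is the Frontier value divided by the vLLM value, `prefix_hit_delta` is the Frontier hit rate minus the vLLM hit rate, and `frontier_complete` is true when `frontier_completed` equals `frontier_total`. That convention is inferred from the numbers, not stated in the file, so treat it as an assumption. The following minimal sketch recomputes a few derived values for the `tp1_n100_scale1` row; the helper name `ratio` and the `row` dictionary are illustrative and not part of any tooling behind this table.

```python
import math


def ratio(frontier: float, vllm: float) -> float:
    """Frontier value divided by the vLLM value; NaN when the vLLM value is zero."""
    return frontier / vllm if vllm else math.nan


# Values copied from the tp1_n100_scale1 row of the table.
row = {
    "frontier_completed": 96,
    "frontier_total": 100,
    "frontier_rps": 0.4048148795,
    "vllm_rps": 0.6879880691,
    "frontier_prefix_hit": 0.2487845616,
    "vllm_prefix_hit": 0.2510820686,
    "frontier_ttft_p50_s": 0.9087481136,
    "vllm_ttft_p50_s": 4.503025495,
}

# Assumed definitions of the derived columns (inferred from the data).
frontier_complete = row["frontier_completed"] == row["frontier_total"]
rps_ratio = ratio(row["frontier_rps"], row["vllm_rps"])
prefix_hit_delta = row["frontier_prefix_hit"] - row["vllm_prefix_hit"]
ttft_p50_ratio = ratio(row["frontier_ttft_p50_s"], row["vllm_ttft_p50_s"])

print(f"frontier_complete = {frontier_complete}")
print(f"rps_ratio         = {rps_ratio:.6f}")
print(f"prefix_hit_delta  = {prefix_hit_delta:.6f}")
print(f"ttft_p50_ratio    = {ttft_p50_ratio:.6f}")
```

Under this convention a ratio below 1 means a lower Frontier value than vLLM: for the throughput columns (`rps`, `total_tps`, `decode_tps`) that is lower throughput, while for the latency columns (`ttft`, `tpot`, `e2e`) it is lower latency. Rows where `frontier_complete` is false compare unequal numbers of completed requests, so their ratios are best read with the `notes` column in mind.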