agentic-kvc/analysis/mb2/intra_kvboth_breakdown.json
Gahow Wang de164e5a64 MB2: pure KV-transfer cost on dash1 intra-node — Mooncake ~9.7 GB/s steady
Full sweep result on dash1 GPU 0+1 with vanilla vLLM 0.18.1 +
mooncake-transfer-engine 0.3.11, kv_both connector. Per-stage decomposition
via the instrumentation patch (analyze_mb2.py pairs A's send_blocks with
B's receive_kv enter/finish by time window).
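
The time-window pairing can be sketched as follows. This is a minimal
illustration, not the actual analyze_mb2.py: the Send/Recv event classes,
their field names, and the pair_events helper are hypothetical. Only the
derived keys (pure_transfer_s, rx_total_s, rx_overhead_s) mirror the
breakdown JSON.

```python
from dataclasses import dataclass


@dataclass
class Send:
    """Hypothetical A-side send_blocks event."""
    t_start: float
    t_end: float
    nbytes: int


@dataclass
class Recv:
    """Hypothetical B-side receive_kv enter/finish pair."""
    t_enter: float
    t_finish: float


def pair_events(sends, recvs):
    """Match each send to the receive window that contains its start time."""
    rows = []
    recvs = sorted(recvs, key=lambda r: r.t_enter)
    for s in sorted(sends, key=lambda e: e.t_start):
        match = next(
            (r for r in recvs if r.t_enter <= s.t_start <= r.t_finish), None
        )
        if match is None:
            continue  # no receive window covers this send; leave it unpaired
        pure = s.t_end - s.t_start
        total = match.t_finish - match.t_enter
        rows.append({
            "total_bytes": s.nbytes,
            "pure_transfer_s": pure,
            "rx_total_s": total,
            "rx_overhead_s": total - pure,  # receive time not spent transferring
        })
    return rows


# One 1024-token transfer, with timestamps made relative to the receive start.
demo = pair_events(
    [Send(t_start=0.001136, t_end=0.011533, nbytes=100663296)],
    [Recv(t_enter=0.0, t_finish=0.011839)],
)
print(demo)
```

In this sketch rx_overhead_s is simply rx_total_s minus pure_transfer_s,
which is consistent with the three values stored in each row of the
breakdown.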

Steady-state (1k..32k tokens, 96 MiB..3 GiB KV):
   pure_transfer ≈ size / 9.7 GB/s
   rx_overhead   ≈ 1.5–3 ms typical (ZMQ handshake + P-side setup)
   bandwidth     p50 ≈ 9.6–10.0 GB/s, stable through 16k tokens;
                 at 32k, 2 of 5 runs dropped to ~6 GB/s
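
The steady-state model above can be written as a small estimator. This is
a sketch under stated assumptions: the function name is hypothetical, the
2.5 ms default overhead is an assumed value within the quoted range, and
GB/s is taken as decimal (1e9 bytes per second), which is how the
throughput figures in the breakdown are computed.

```python
GB = 1e9          # decimal gigabyte, matching the throughput values
MIB = 1024 ** 2   # binary mebibyte, matching kv_mib in the summary


def predict_rx_total_s(kv_bytes, bw_gbps=9.7, overhead_s=0.0025):
    """Steady-state estimate: fixed receive overhead plus size / bandwidth."""
    return overhead_s + kv_bytes / (bw_gbps * GB)


# Sizes from the sweep that fall inside the steady-state regime.
for tokens, kv_mib in [(1024, 96), (8192, 768), (32768, 3072)]:
    est_ms = predict_rx_total_s(kv_mib * MIB) * 1e3
    print(f"{tokens:>6} tokens  {kv_mib:>5} MiB  ~{est_ms:.1f} ms")
```

The estimate only applies up to 32k tokens; the larger sizes do not follow
a fixed-bandwidth model.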

Large-size regime (65k..131k tokens, 6..12 GiB):
   p50 bandwidth collapses to 3.4–4.5 GB/s
   max bandwidth still hits ~9.7 GB/s (some runs achieve it)
   p99 agentic request (11.5 GiB) lands here

Implication for §3.2 PD-disaggregation cost argument:
   median agentic decode = 50–200 ms (tool-call JSON output)
   agentic-tail KV transfer (p99 request, 11.5 GiB):
     best case (9.7 GB/s)  ≈ 1.27 s
     observed range         1.3 – 10 s (measured at 12 GiB)
   ⇒ KV transfer takes roughly 7–200× as long as the decode it enables.
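
The arithmetic behind this comparison can be checked directly. A sketch,
assuming GB/s is decimal (as the throughput values in the breakdown are)
and GiB is binary. The observed times are the min and max pure_transfer_s
of the 131072-token (12 GiB) rows, used here as a stand-in for the
11.5 GiB request.

```python
GIB = 1024 ** 3   # binary gibibyte
GB = 1e9          # decimal gigabyte

kv_bytes = 11.5 * GIB                  # p99 agentic request size
best_case_s = kv_bytes / (9.7 * GB)    # at the steady-state bandwidth

observed_s = (1.333, 9.862)            # min / max at 12 GiB, from the rows
decode_s = (0.050, 0.200)              # quoted median agentic decode range

# Extremes of the ratio: fastest transfer vs slowest decode, and vice versa.
ratio_low = observed_s[0] / decode_s[1]
ratio_high = observed_s[1] / decode_s[0]

print(f"best case: {best_case_s:.2f} s")
print(f"transfer / decode: {ratio_low:.1f}x to {ratio_high:.0f}x")
```

Converting GiB to bytes before dividing matters here: treating 11.5 GiB as
11.5 GB understates the best-case time by about 7%.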

This is intra-node — the lower-bound transfer cost. Inter-node RDMA
will be slower; that's MB2 phase 2.

Adds:
- analyze_mb2.py: pair A.send_blocks ↔ B.receive_kv by time window;
  per-size aggregation (n, ms_p50, ms_min/max, GB/s_p50/max)
- plot_mb2.py: log-log transfer-time chart + bandwidth-vs-size chart
- analysis/mb2/A_intra_kvboth.jsonl, B_intra_kvboth.jsonl: raw events
  (51 + 102 events including the sanity preamble)
- analysis/mb2/intra_kvboth_breakdown.json: paired and aggregated
- figs/mb2_transfer_time_intra.png, figs/mb2_transfer_bw_intra.png
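
The per-size aggregation can be sketched as below. This is not the code in
analyze_mb2.py: the summarize helper is hypothetical, and taking p50 as the
plain median is an assumption. Input keys follow the "rows" entries and
output keys follow the "summary" entries of the breakdown JSON.

```python
from collections import defaultdict
from statistics import median


def summarize(rows):
    """Group paired rows by token count; report p50/min/max time and GB/s."""
    by_size = defaultdict(list)
    for r in rows:
        by_size[r["input_tokens_est"]].append(r)

    out = []
    for tokens, group in sorted(by_size.items()):
        times = [r["pure_transfer_s"] for r in group]
        # Decimal GB/s: bytes / seconds / 1e9.
        gbps = [r["total_bytes"] / r["pure_transfer_s"] / 1e9 for r in group]
        out.append({
            "input_tokens": tokens,
            "n": len(group),
            "pure_transfer_ms_p50": round(median(times) * 1e3, 2),
            "pure_transfer_ms_min": round(min(times) * 1e3, 2),
            "pure_transfer_ms_max": round(max(times) * 1e3, 2),
            "throughput_gbps_p50": round(median(gbps), 2),
            "throughput_gbps_max": round(max(gbps), 2),
        })
    return out


# The five 65536-token rows from the breakdown, times rounded to 4 decimals.
rows = [
    {"input_tokens_est": 65536, "total_bytes": 6442450944, "pure_transfer_s": t}
    for t in (1.9844, 2.1099, 1.8951, 0.9278, 0.6652)
]
print(summarize(rows))
```

For the 65536-token group this gives a p50 of 3.4 GB/s against a max of
9.68 GB/s, the same gap the summary reports for the large-size regime.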

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 19:04:03 +08:00

758 lines
22 KiB
JSON

{
"rows": [
{
"input_tokens_est": 512,
"total_bytes": 50331648,
"pure_transfer_s": 0.023202952987048775,
"rx_total_s": 0.03333390498301014,
"rx_overhead_s": 0.010130951995961368,
"rx_t_start_unix": 1779879143.1678784,
"send_t_start_unix": 1779879143.174031,
"req_ids": [
"cmpl-ad00672f263a6643-0-9479211a"
]
},
{
"input_tokens_est": 512,
"total_bytes": 50331648,
"pure_transfer_s": 0.005375694017857313,
"rx_total_s": 0.007019245007541031,
"rx_overhead_s": 0.0016435509896837175,
"rx_t_start_unix": 1779879143.2968972,
"send_t_start_unix": 1779879143.2982283,
"req_ids": [
"cmpl-ace77e2b02f9f141-0-b3c061bc"
]
},
{
"input_tokens_est": 2048,
"total_bytes": 201326592,
"pure_transfer_s": 0.021170366962905973,
"rx_total_s": 0.02278437599306926,
"rx_overhead_s": 0.0016140090301632881,
"rx_t_start_unix": 1779879143.5146625,
"send_t_start_unix": 1779879143.5159554,
"req_ids": [
"cmpl-a4a2366879c68ded-0-8ac4098e"
]
},
{
"input_tokens_est": 2048,
"total_bytes": 201326592,
"pure_transfer_s": 0.020726953051052988,
"rx_total_s": 0.022794076008722186,
"rx_overhead_s": 0.0020671229576691985,
"rx_t_start_unix": 1779879143.6958342,
"send_t_start_unix": 1779879143.6974514,
"req_ids": [
"cmpl-8690cafcace0d5e2-0-b89f33d2"
]
},
{
"input_tokens_est": 8192,
"total_bytes": 805306368,
"pure_transfer_s": 0.08536655298667029,
"rx_total_s": 0.08753501297906041,
"rx_overhead_s": 0.002168459992390126,
"rx_t_start_unix": 1779879144.3279662,
"send_t_start_unix": 1779879144.3294952,
"req_ids": [
"cmpl-b087e2ec4cfa8eb7-0-b908f425"
]
},
{
"input_tokens_est": 8192,
"total_bytes": 805306368,
"pure_transfer_s": 0.08367906499188393,
"rx_total_s": 0.0860149699728936,
"rx_overhead_s": 0.002335904981009662,
"rx_t_start_unix": 1779879145.040141,
"send_t_start_unix": 1779879145.0419943,
"req_ids": [
"cmpl-a115d16ff5575e08-0-9fa81984"
]
},
{
"input_tokens_est": 16,
"total_bytes": 1572864,
"pure_transfer_s": 0.0004059679922647774,
"rx_total_s": 0.002459956973325461,
"rx_overhead_s": 0.0020539889810606837,
"rx_t_start_unix": 1779879221.7062025,
"send_t_start_unix": 1779879221.7078288,
"req_ids": [
"cmpl-9e585ed083951df5-0-b03f812b"
]
},
{
"input_tokens_est": 16,
"total_bytes": 1572864,
"pure_transfer_s": 0.000346789020113647,
"rx_total_s": 0.0020201010047458112,
"rx_overhead_s": 0.0016733119846321642,
"rx_t_start_unix": 1779879221.7826598,
"send_t_start_unix": 1779879221.7838593,
"req_ids": [
"cmpl-9271d403c044eadd-0-9c3c4639"
]
},
{
"input_tokens_est": 512,
"total_bytes": 50331648,
"pure_transfer_s": 0.005353622022084892,
"rx_total_s": 0.006836243963334709,
"rx_overhead_s": 0.0014826219412498176,
"rx_t_start_unix": 1779879221.859549,
"send_t_start_unix": 1779879221.8607252,
"req_ids": [
"cmpl-82a580cefd3e2440-0-a383c3c4"
]
},
{
"input_tokens_est": 512,
"total_bytes": 50331648,
"pure_transfer_s": 0.005279594974126667,
"rx_total_s": 0.00694335694424808,
"rx_overhead_s": 0.0016637619701214135,
"rx_t_start_unix": 1779879221.9419758,
"send_t_start_unix": 1779879221.9432015,
"req_ids": [
"cmpl-a31cb4bc9e7f63d2-0-8f48aacd"
]
},
{
"input_tokens_est": 512,
"total_bytes": 50331648,
"pure_transfer_s": 0.0053006180096417665,
"rx_total_s": 0.006697195000015199,
"rx_overhead_s": 0.0013965769903734326,
"rx_t_start_unix": 1779879222.0232244,
"send_t_start_unix": 1779879222.0243337,
"req_ids": [
"cmpl-a9dfc1a5b425d994-0-a0930098"
]
},
{
"input_tokens_est": 1024,
"total_bytes": 100663296,
"pure_transfer_s": 0.010396577999927104,
"rx_total_s": 0.01183948403922841,
"rx_overhead_s": 0.001442906039301306,
"rx_t_start_unix": 1779879222.1297998,
"send_t_start_unix": 1779879222.130936,
"req_ids": [
"cmpl-9712857755af2efc-0-90b2dc9b"
]
},
{
"input_tokens_est": 1024,
"total_bytes": 100663296,
"pure_transfer_s": 0.010438029014039785,
"rx_total_s": 0.01214482297655195,
"rx_overhead_s": 0.0017067939625121653,
"rx_t_start_unix": 1779879222.243023,
"send_t_start_unix": 1779879222.2442062,
"req_ids": [
"cmpl-b4f0a10dee65acbe-0-a3c132fc"
]
},
{
"input_tokens_est": 1024,
"total_bytes": 100663296,
"pure_transfer_s": 0.010436972021125257,
"rx_total_s": 0.011961110983975232,
"rx_overhead_s": 0.0015241389628499746,
"rx_t_start_unix": 1779879222.3569698,
"send_t_start_unix": 1779879222.3581295,
"req_ids": [
"cmpl-b4c514b80b52a3f2-0-bcd24f8e"
]
},
{
"input_tokens_est": 1024,
"total_bytes": 100663296,
"pure_transfer_s": 0.010396371013484895,
"rx_total_s": 0.011788576026447117,
"rx_overhead_s": 0.001392205012962222,
"rx_t_start_unix": 1779879222.4715128,
"send_t_start_unix": 1779879222.4725878,
"req_ids": [
"cmpl-ac7118d8090d181c-0-8af4adf0"
]
},
{
"input_tokens_est": 1024,
"total_bytes": 100663296,
"pure_transfer_s": 0.010352785000577569,
"rx_total_s": 0.0118055299972184,
"rx_overhead_s": 0.0014527449966408312,
"rx_t_start_unix": 1779879222.5826046,
"send_t_start_unix": 1779879222.5837166,
"req_ids": [
"cmpl-85291bcb93aaf638-0-868db1a8"
]
},
{
"input_tokens_est": 16,
"total_bytes": 1572864,
"pure_transfer_s": 0.00034007197245955467,
"rx_total_s": 0.0021119200391694903,
"rx_overhead_s": 0.0017718480667099357,
"rx_t_start_unix": 1779879222.750828,
"send_t_start_unix": 1779879222.7521152,
"req_ids": [
"cmpl-a448cf2e059ba0c9-0-a1360796"
]
},
{
"input_tokens_est": 16,
"total_bytes": 1572864,
"pure_transfer_s": 0.00041691696969792247,
"rx_total_s": 0.0022232600022107363,
"rx_overhead_s": 0.0018063430325128138,
"rx_t_start_unix": 1779879222.913044,
"send_t_start_unix": 1779879222.9143836,
"req_ids": [
"cmpl-b486fd9e945a4658-0-8bb561cd"
]
},
{
"input_tokens_est": 2048,
"total_bytes": 201326592,
"pure_transfer_s": 0.020633380976505578,
"rx_total_s": 0.022250515001360327,
"rx_overhead_s": 0.0016171340248547494,
"rx_t_start_unix": 1779879223.0765986,
"send_t_start_unix": 1779879223.0778644,
"req_ids": [
"cmpl-82da2bfe65f276c6-0-88d9a9a2"
]
},
{
"input_tokens_est": 2048,
"total_bytes": 201326592,
"pure_transfer_s": 0.020639199996367097,
"rx_total_s": 0.022157608007546514,
"rx_overhead_s": 0.0015184080111794174,
"rx_t_start_unix": 1779879223.2591784,
"send_t_start_unix": 1779879223.2603853,
"req_ids": [
"cmpl-93bd777652eba5f3-0-9ec3d058"
]
},
{
"input_tokens_est": 2048,
"total_bytes": 201326592,
"pure_transfer_s": 0.020575353992171586,
"rx_total_s": 0.022589912987314165,
"rx_overhead_s": 0.002014558995142579,
"rx_t_start_unix": 1779879223.4402068,
"send_t_start_unix": 1779879223.4418828,
"req_ids": [
"cmpl-81f950480a3cabf9-0-bbf8584f"
]
},
{
"input_tokens_est": 4096,
"total_bytes": 402653184,
"pure_transfer_s": 0.041439525957684964,
"rx_total_s": 0.043345845013391227,
"rx_overhead_s": 0.0019063190557062626,
"rx_t_start_unix": 1779879223.7529812,
"send_t_start_unix": 1779879223.7544343,
"req_ids": [
"cmpl-b109ed06b5882659-0-8d14993c"
]
},
{
"input_tokens_est": 4096,
"total_bytes": 402653184,
"pure_transfer_s": 0.04152030003024265,
"rx_total_s": 0.04341953102266416,
"rx_overhead_s": 0.0018992309924215078,
"rx_t_start_unix": 1779879224.0899644,
"send_t_start_unix": 1779879224.0914912,
"req_ids": [
"cmpl-8a57776c81d64b2c-0-ace8fb2b"
]
},
{
"input_tokens_est": 4096,
"total_bytes": 402653184,
"pure_transfer_s": 0.04148670402355492,
"rx_total_s": 0.04336977802449837,
"rx_overhead_s": 0.0018830740009434521,
"rx_t_start_unix": 1779879224.424807,
"send_t_start_unix": 1779879224.4262393,
"req_ids": [
"cmpl-9b1a5dce18758450-0-b17b3649"
]
},
{
"input_tokens_est": 4096,
"total_bytes": 402653184,
"pure_transfer_s": 0.04146742797456682,
"rx_total_s": 0.043769759009592235,
"rx_overhead_s": 0.002302331035025418,
"rx_t_start_unix": 1779879224.7599711,
"send_t_start_unix": 1779879224.7617002,
"req_ids": [
"cmpl-8c7d412b85f43ed7-0-9dea4add"
]
},
{
"input_tokens_est": 4096,
"total_bytes": 402653184,
"pure_transfer_s": 0.04143296502297744,
"rx_total_s": 0.043612666020635515,
"rx_overhead_s": 0.002179700997658074,
"rx_t_start_unix": 1779879225.0962389,
"send_t_start_unix": 1779879225.0978234,
"req_ids": [
"cmpl-8860308db3f010a5-0-ad51eb46"
]
},
{
"input_tokens_est": 16,
"total_bytes": 1572864,
"pure_transfer_s": 0.0003991159610450268,
"rx_total_s": 0.002386144013144076,
"rx_overhead_s": 0.001987028052099049,
"rx_t_start_unix": 1779879225.7592747,
"send_t_start_unix": 1779879225.760789,
"req_ids": [
"cmpl-86cca1a2b9427801-0-ba41ade7"
]
},
{
"input_tokens_est": 16,
"total_bytes": 1572864,
"pure_transfer_s": 0.00041423802031204104,
"rx_total_s": 0.0023903060355223715,
"rx_overhead_s": 0.0019760680152103305,
"rx_t_start_unix": 1779879226.384918,
"send_t_start_unix": 1779879226.3864496,
"req_ids": [
"cmpl-a208c6d804293be7-0-94d265ab"
]
},
{
"input_tokens_est": 8192,
"total_bytes": 805306368,
"pure_transfer_s": 0.08309489500243217,
"rx_total_s": 0.08524628396844491,
"rx_overhead_s": 0.002151388966012746,
"rx_t_start_unix": 1779879227.0092332,
"send_t_start_unix": 1779879227.0107942,
"req_ids": [
"cmpl-b53bea2317cc1211-0-8fcad8a8"
]
},
{
"input_tokens_est": 8192,
"total_bytes": 805306368,
"pure_transfer_s": 0.08372796402545646,
"rx_total_s": 0.08596085698809475,
"rx_overhead_s": 0.0022328929626382887,
"rx_t_start_unix": 1779879227.7190688,
"send_t_start_unix": 1779879227.7207224,
"req_ids": [
"cmpl-9daf909593bbdf03-0-8fd7d50e"
]
},
{
"input_tokens_est": 8192,
"total_bytes": 805306368,
"pure_transfer_s": 0.08398396399570629,
"rx_total_s": 0.0860762019874528,
"rx_overhead_s": 0.002092237991746515,
"rx_t_start_unix": 1779879228.4297745,
"send_t_start_unix": 1779879228.4314566,
"req_ids": [
"cmpl-9ef40f3b6d736128-0-8e8e1c30"
]
},
{
"input_tokens_est": 16384,
"total_bytes": 1610612736,
"pure_transfer_s": 0.16950496198842302,
"rx_total_s": 0.1721468890318647,
"rx_overhead_s": 0.002641927043441683,
"rx_t_start_unix": 1779879230.131392,
"send_t_start_unix": 1779879230.1334376,
"req_ids": [
"cmpl-851e5d7e3e83d7ea-0-a66a5e0b"
]
},
{
"input_tokens_est": 16384,
"total_bytes": 1610612736,
"pure_transfer_s": 0.16713789198547602,
"rx_total_s": 0.16974544001277536,
"rx_overhead_s": 0.0026075480272993445,
"rx_t_start_unix": 1779879231.896075,
"send_t_start_unix": 1779879231.8981037,
"req_ids": [
"cmpl-9be12af6a9ccccf5-0-af1230c7"
]
},
{
"input_tokens_est": 16384,
"total_bytes": 1610612736,
"pure_transfer_s": 0.16713115200400352,
"rx_total_s": 0.16975757898762822,
"rx_overhead_s": 0.0026264269836246967,
"rx_t_start_unix": 1779879233.6589305,
"send_t_start_unix": 1779879233.6608078,
"req_ids": [
"cmpl-b61b9b237366297b-0-9832f0e3"
]
},
{
"input_tokens_est": 16384,
"total_bytes": 1610612736,
"pure_transfer_s": 0.16709016199456528,
"rx_total_s": 0.1695251659839414,
"rx_overhead_s": 0.0024350039893761277,
"rx_t_start_unix": 1779879235.4181106,
"send_t_start_unix": 1779879235.419875,
"req_ids": [
"cmpl-bae0d0efe47ece8f-0-affbc685"
]
},
{
"input_tokens_est": 16384,
"total_bytes": 1610612736,
"pure_transfer_s": 0.166486973001156,
"rx_total_s": 0.16962904302636161,
"rx_overhead_s": 0.003142070025205612,
"rx_t_start_unix": 1779879237.1803744,
"send_t_start_unix": 1779879237.1821773,
"req_ids": [
"cmpl-a34bc73c9cd2efc1-0-90d647fc"
]
},
{
"input_tokens_est": 32768,
"total_bytes": 3221225472,
"pure_transfer_s": 0.31926770601421595,
"rx_total_s": 0.32203804596792907,
"rx_overhead_s": 0.002770339953713119,
"rx_t_start_unix": 1779879241.9859307,
"send_t_start_unix": 1779879241.9880297,
"req_ids": [
"cmpl-89a36c12ee6b0ff3-0-9fddbc0f"
]
},
{
"input_tokens_est": 32768,
"total_bytes": 3221225472,
"pure_transfer_s": 0.3197040680097416,
"rx_total_s": 0.3227974839974195,
"rx_overhead_s": 0.003093415987677872,
"rx_t_start_unix": 1779879246.9755645,
"send_t_start_unix": 1779879246.9779432,
"req_ids": [
"cmpl-8d65512eb7e3c36c-0-8b23597c"
]
},
{
"input_tokens_est": 32768,
"total_bytes": 3221225472,
"pure_transfer_s": 0.32088329299585894,
"rx_total_s": 0.3240378479822539,
"rx_overhead_s": 0.003154554986394942,
"rx_t_start_unix": 1779879251.9618897,
"send_t_start_unix": 1779879251.9643052,
"req_ids": [
"cmpl-a13c271ecbbca78b-0-b76a0370"
]
},
{
"input_tokens_est": 32768,
"total_bytes": 3221225472,
"pure_transfer_s": 0.5439103110111319,
"rx_total_s": 0.5924434679909609,
"rx_overhead_s": 0.04853315697982907,
"rx_t_start_unix": 1779879256.9512377,
"send_t_start_unix": 1779879256.9989722,
"req_ids": [
"cmpl-bada04ec8c556aca-0-a263d637"
]
},
{
"input_tokens_est": 32768,
"total_bytes": 3221225472,
"pure_transfer_s": 0.5193864739849232,
"rx_total_s": 0.5644763479940593,
"rx_overhead_s": 0.04508987400913611,
"rx_t_start_unix": 1779879262.2127163,
"send_t_start_unix": 1779879262.2562187,
"req_ids": [
"cmpl-9641a077022e6123-0-8c3c0975"
]
},
{
"input_tokens_est": 65536,
"total_bytes": 6442450944,
"pure_transfer_s": 1.9844180009677075,
"rx_total_s": 2.0784930550144054,
"rx_overhead_s": 0.09407505404669791,
"rx_t_start_unix": 1779879278.1063075,
"send_t_start_unix": 1779879278.199048,
"req_ids": [
"cmpl-bb3a4e5084af8c3a-0-bdfa0931"
]
},
{
"input_tokens_est": 65536,
"total_bytes": 6442450944,
"pure_transfer_s": 2.1099297259934247,
"rx_total_s": 2.2067435560165904,
"rx_overhead_s": 0.09681383002316579,
"rx_t_start_unix": 1779879295.600993,
"send_t_start_unix": 1779879295.6967168,
"req_ids": [
"cmpl-91b951f85c93a71b-0-8396bee5"
]
},
{
"input_tokens_est": 65536,
"total_bytes": 6442450944,
"pure_transfer_s": 1.8950715209939517,
"rx_total_s": 1.9879729640088044,
"rx_overhead_s": 0.0929014430148527,
"rx_t_start_unix": 1779879313.2315958,
"send_t_start_unix": 1779879313.3236735,
"req_ids": [
"cmpl-81d236ecb6aadadf-0-ac184d51"
]
},
{
"input_tokens_est": 65536,
"total_bytes": 6442450944,
"pure_transfer_s": 0.9277855920372531,
"rx_total_s": 0.9849357060156763,
"rx_overhead_s": 0.05715011397842318,
"rx_t_start_unix": 1779879330.6154163,
"send_t_start_unix": 1779879330.6715357,
"req_ids": [
"cmpl-a4c76c62b44c4295-0-b007a6ed"
]
},
{
"input_tokens_est": 65536,
"total_bytes": 6442450944,
"pure_transfer_s": 0.6652462020283565,
"rx_total_s": 0.6725030990201049,
"rx_overhead_s": 0.007256896991748363,
"rx_t_start_unix": 1779879346.990221,
"send_t_start_unix": 1779879346.9950044,
"req_ids": [
"cmpl-a06d4b774a8af9a5-0-980e9d23"
]
},
{
"input_tokens_est": 131072,
"total_bytes": 12884901888,
"pure_transfer_s": 1.3330365709844045,
"rx_total_s": 1.3384539679973386,
"rx_overhead_s": 0.005417397012934089,
"rx_t_start_unix": 1779879402.7123013,
"send_t_start_unix": 1779879402.7169023,
"req_ids": [
"cmpl-bf0d435e06e3349f-0-8507c933"
]
},
{
"input_tokens_est": 131072,
"total_bytes": 12884901888,
"pure_transfer_s": 5.839069904992357,
"rx_total_s": 5.973284716019407,
"rx_overhead_s": 0.13421481102705002,
"rx_t_start_unix": 1779879458.9232886,
"send_t_start_unix": 1779879459.0566247,
"req_ids": [
"cmpl-9f87ae0fb0c7eec8-0-a8a1daea"
]
},
{
"input_tokens_est": 131072,
"total_bytes": 12884901888,
"pure_transfer_s": 9.862486142024864,
"rx_total_s": 10.056511385017075,
"rx_overhead_s": 0.19402524299221113,
"rx_t_start_unix": 1779879519.7647448,
"send_t_start_unix": 1779879519.9567635,
"req_ids": [
"cmpl-a62e48e40e6c6ad7-0-acca9741"
]
},
{
"input_tokens_est": 131072,
"total_bytes": 12884901888,
"pure_transfer_s": 2.8350498770014383,
"rx_total_s": 2.925714804965537,
"rx_overhead_s": 0.09066492796409875,
"rx_t_start_unix": 1779879584.888362,
"send_t_start_unix": 1779879584.9780834,
"req_ids": [
"cmpl-824479d53bab40e4-0-af951a11"
]
},
{
"input_tokens_est": 131072,
"total_bytes": 12884901888,
"pure_transfer_s": 1.485496642999351,
"rx_total_s": 1.5183607729850337,
"rx_overhead_s": 0.032864129985682666,
"rx_t_start_unix": 1779879642.6076336,
"send_t_start_unix": 1779879642.639775,
"req_ids": [
"cmpl-9f06f19c981c0b3f-0-b3afb370"
]
}
],
"summary": [
{
"input_tokens": 16,
"kv_mib": 1.5,
"n": 6,
"pure_transfer_ms_mean": 0.39,
"pure_transfer_ms_p50": 0.4,
"pure_transfer_ms_max": 0.42,
"pure_transfer_ms_min": 0.34,
"rx_total_ms_mean": 2.27,
"rx_overhead_ms_mean": 1.88,
"throughput_gbps_mean": 4.09,
"throughput_gbps_p50": 3.91,
"throughput_gbps_max": 4.63
},
{
"input_tokens": 512,
"kv_mib": 48.0,
"n": 5,
"pure_transfer_ms_mean": 8.9,
"pure_transfer_ms_p50": 5.35,
"pure_transfer_ms_max": 23.2,
"pure_transfer_ms_min": 5.28,
"rx_total_ms_mean": 12.17,
"rx_overhead_ms_mean": 3.26,
"throughput_gbps_mean": 7.99,
"throughput_gbps_p50": 9.4,
"throughput_gbps_max": 9.53
},
{
"input_tokens": 1024,
"kv_mib": 96.0,
"n": 5,
"pure_transfer_ms_mean": 10.4,
"pure_transfer_ms_p50": 10.4,
"pure_transfer_ms_max": 10.44,
"pure_transfer_ms_min": 10.35,
"rx_total_ms_mean": 11.91,
"rx_overhead_ms_mean": 1.5,
"throughput_gbps_mean": 9.68,
"throughput_gbps_p50": 9.68,
"throughput_gbps_max": 9.72
},
{
"input_tokens": 2048,
"kv_mib": 192.0,
"n": 5,
"pure_transfer_ms_mean": 20.75,
"pure_transfer_ms_p50": 20.64,
"pure_transfer_ms_max": 21.17,
"pure_transfer_ms_min": 20.58,
"rx_total_ms_mean": 22.52,
"rx_overhead_ms_mean": 1.77,
"throughput_gbps_mean": 9.7,
"throughput_gbps_p50": 9.75,
"throughput_gbps_max": 9.78
},
{
"input_tokens": 4096,
"kv_mib": 384.0,
"n": 5,
"pure_transfer_ms_mean": 41.47,
"pure_transfer_ms_p50": 41.47,
"pure_transfer_ms_max": 41.52,
"pure_transfer_ms_min": 41.43,
"rx_total_ms_mean": 43.5,
"rx_overhead_ms_mean": 2.03,
"throughput_gbps_mean": 9.71,
"throughput_gbps_p50": 9.71,
"throughput_gbps_max": 9.72
},
{
"input_tokens": 8192,
"kv_mib": 768.0,
"n": 5,
"pure_transfer_ms_mean": 83.97,
"pure_transfer_ms_p50": 83.73,
"pure_transfer_ms_max": 85.37,
"pure_transfer_ms_min": 83.09,
"rx_total_ms_mean": 86.17,
"rx_overhead_ms_mean": 2.2,
"throughput_gbps_mean": 9.59,
"throughput_gbps_p50": 9.62,
"throughput_gbps_max": 9.69
},
{
"input_tokens": 16384,
"kv_mib": 1536.0,
"n": 5,
"pure_transfer_ms_mean": 167.47,
"pure_transfer_ms_p50": 167.13,
"pure_transfer_ms_max": 169.5,
"pure_transfer_ms_min": 166.49,
"rx_total_ms_mean": 170.16,
"rx_overhead_ms_mean": 2.69,
"throughput_gbps_mean": 9.62,
"throughput_gbps_p50": 9.64,
"throughput_gbps_max": 9.67
},
{
"input_tokens": 32768,
"kv_mib": 3072.0,
"n": 5,
"pure_transfer_ms_mean": 404.63,
"pure_transfer_ms_p50": 320.88,
"pure_transfer_ms_max": 543.91,
"pure_transfer_ms_min": 319.27,
"rx_total_ms_mean": 425.16,
"rx_overhead_ms_mean": 20.53,
"throughput_gbps_mean": 8.47,
"throughput_gbps_p50": 10.04,
"throughput_gbps_max": 10.09
},
{
"input_tokens": 65536,
"kv_mib": 6144.0,
"n": 5,
"pure_transfer_ms_mean": 1516.49,
"pure_transfer_ms_p50": 1895.07,
"pure_transfer_ms_max": 2109.93,
"pure_transfer_ms_min": 665.25,
"rx_total_ms_mean": 1586.13,
"rx_overhead_ms_mean": 69.64,
"throughput_gbps_mean": 5.27,
"throughput_gbps_p50": 3.4,
"throughput_gbps_max": 9.68
},
{
"input_tokens": 131072,
"kv_mib": 12288.0,
"n": 5,
"pure_transfer_ms_mean": 4271.03,
"pure_transfer_ms_p50": 2835.05,
"pure_transfer_ms_max": 9862.49,
"pure_transfer_ms_min": 1333.04,
"rx_total_ms_mean": 4362.47,
"rx_overhead_ms_mean": 91.44,
"throughput_gbps_mean": 5.28,
"throughput_gbps_p50": 4.54,
"throughput_gbps_max": 9.67
}
]
}