MB2: pure KV-transfer cost on dash1 intra-node — Mooncake ~9.7 GB/s steady

Full sweep result on dash1 GPU 0+1 with vanilla vLLM 0.18.1 +
mooncake-transfer-engine 0.3.11, kv_both connector. Per-stage decomposition
via the instrumentation patch (analyze_mb2.py pairs A's send_blocks with
B's receive_kv enter/finish by time window).
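
The pairing step could look roughly like the sketch below. This is not the code in analyze_mb2.py: the function name `pair_by_time_window`, the `slack_s` tolerance, and the definition of `rx_overhead_s` as receive duration minus send duration are assumptions made for illustration. Field names are those of the logged events, and the sample records are the first two 96 MiB runs of the sweep.

```python
def pair_by_time_window(sends, recv_finishes, slack_s=0.001):
    """Match each send_blocks event to the receive_kv window that contains it."""
    pairs, used = [], set()
    for s in sends:
        s_start = s["t_start_unix"]
        s_end = s_start + s["duration_s"]
        for i, r in enumerate(recv_finishes):
            if i in used:
                continue  # each receive window is paired at most once
            r_start = r["t_start_unix"]
            r_end = r_start + r["duration_s"]
            # The send must fall inside the receive window (plus a small slack).
            if r_start - slack_s <= s_start and s_end <= r_end + slack_s:
                used.add(i)
                pairs.append({
                    "total_bytes": s["total_bytes"],
                    "send_s": s["duration_s"],
                    "recv_s": r["duration_s"],
                    # Assumed definition: time spent in receive_kv outside the send.
                    "rx_overhead_s": r["duration_s"] - s["duration_s"],
                })
                break
    return pairs


# A-side send_blocks and B-side receive_kv_finish records (fields trimmed).
sends = [
    {"total_bytes": 100663296, "duration_s": 0.010396577999927104,
     "t_start_unix": 1779879222.130936},
    {"total_bytes": 100663296, "duration_s": 0.010438029014039785,
     "t_start_unix": 1779879222.2442062},
]
recvs = [
    {"duration_s": 0.01183948403922841, "t_start_unix": 1779879222.1297998},
    {"duration_s": 0.01214482297655195, "t_start_unix": 1779879222.243023},
]

pairs = pair_by_time_window(sends, recvs)
for p in pairs:
    print(f"{p['total_bytes']} B  rx_overhead = {p['rx_overhead_s'] * 1e3:.2f} ms")
```

Because both processes run on the same host, the two logs share a clock, so a containment test with a small slack is enough here; an inter-node run would need clock-offset handling.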

Steady-state (1k..32k tokens, 96 MiB..3 GiB KV):
   pure_transfer ≈ size / 9.7 GB/s
   rx_overhead   ≈ 2–3 ms (ZMQ handshake + P-side setup)
   bandwidth     ≈ 9.6–10.1 GB/s at p50; stable up to 16k, while at 32k (3 GiB)
                   2 of 5 runs drop to ~6 GB/s
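
As a rough illustration of the steady-state numbers, the sketch below computes bandwidth from one logged send_blocks record and predicts a transfer time from the two terms above. Summing pure_transfer and rx_overhead into one estimate, the 2.5 ms midpoint, and the helper names are assumptions. Note that the GB/s figures are decimal (10^9 bytes per second) while the KV sizes are binary (MiB, GiB).

```python
GB = 1e9       # decimal gigabyte, the unit behind the GB/s figures
GIB = 2 ** 30  # binary gibibyte, the unit behind the KV sizes


def bandwidth_gbps(total_bytes, duration_s):
    """Bandwidth of one send_blocks event in decimal GB/s."""
    return total_bytes / duration_s / GB


def predict_transfer_s(total_bytes, bw_gbps=9.7, rx_overhead_s=0.0025):
    """Steady-state estimate: size / bandwidth plus a fixed receive overhead."""
    return total_bytes / (bw_gbps * GB) + rx_overhead_s


# One 96 MiB run taken from the A-side log.
bw_96mib = bandwidth_gbps(100663296, 0.010396577999927104)
pred_3gib = predict_transfer_s(3 * GIB)

print(f"96 MiB run: {bw_96mib:.2f} GB/s")
print(f"3 GiB predicted: {pred_3gib * 1e3:.0f} ms")
```

The logged receive_kv durations for the fast 3 GiB runs are about 322 to 324 ms, slightly under this estimate because those runs reach about 10 GB/s.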

Large-size regime (65k..131k tokens, 6..12 GiB):
   p50 bandwidth collapses to 3.4–4.5 GB/s
   max bandwidth still hits ~9.7 GB/s (1 of 5 runs at each size)
   p99 agentic request (11.5 GiB) lands here
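
The p50 and max figures for this regime can be reproduced from the A-side send_blocks durations. The sketch below is a minimal per-size aggregation, not the analyze_mb2.py implementation: the `aggregate` helper and its output keys are assumptions modelled on the column names listed in this commit, and the durations are rounded to four decimals from the log.

```python
from statistics import median

GB = 1e9  # decimal gigabyte

# send_blocks durations in seconds, keyed by total_bytes (6 GiB and 12 GiB).
durations_s = {
    6442450944: [1.9844, 2.1099, 1.8951, 0.9278, 0.6652],
    12884901888: [1.3330, 5.8391, 9.8625, 2.8350, 1.4855],
}


def aggregate(durations_by_size):
    """Per-size summary: run count, duration percentiles, bandwidth p50/max."""
    out = {}
    for size, ds in sorted(durations_by_size.items()):
        bws = [size / d / GB for d in ds]
        out[size] = {
            "n": len(ds),
            "ms_p50": median(ds) * 1e3,
            "ms_min": min(ds) * 1e3,
            "ms_max": max(ds) * 1e3,
            "gbps_p50": median(bws),
            "gbps_max": max(bws),
        }
    return out


agg = aggregate(durations_s)
for size, row in agg.items():
    print(f"{size / 2**30:.0f} GiB: p50 {row['gbps_p50']:.1f} GB/s, "
          f"max {row['gbps_max']:.1f} GB/s over n={row['n']}")
```

With only five runs per size, the p50 is a single observation, so the 3.4 and 4.5 GB/s values should be read as indicative rather than precise.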

Implication for §3.2 PD-disaggregation cost argument:
   median agentic decode = 50–200 ms (tool-call JSON output)
   agentic-tail KV transfer (p99 request, 11.5 GiB ≈ 12.35 GB):
     best case (9.7 GB/s)  ≈ 1.27 s
     observed range         1.5 – 10 s
   ⇒ KV transfer is roughly 8–200× larger than the decode it enables
     (1.5 s / 200 ms up to 10 s / 50 ms).
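
The best-case figure depends on whether 11.5 is read as GiB or as decimal GB, because the bandwidth is in decimal GB/s. The short check below works through both readings and also compares the best case alone against the 50–200 ms decode range quoted above. Variable names are illustrative.

```python
GB = 1e9
GIB = 2 ** 30
BW = 9.7 * GB  # steady-state bandwidth in bytes per second

best_gib_s = 11.5 * GIB / BW  # 11.5 read as GiB
best_gb_s = 11.5 * GB / BW    # 11.5 read as decimal GB

decode_s = (0.050, 0.200)     # decode range from the text, in seconds
ratio_lo = best_gib_s / decode_s[1]
ratio_hi = best_gib_s / decode_s[0]

print(f"best case: {best_gib_s:.2f} s (GiB) vs {best_gb_s:.2f} s (GB)")
print(f"best-case transfer / decode: {ratio_lo:.1f}x to {ratio_hi:.1f}x")
```

Even at full steady-state bandwidth, the transfer is several times longer than the decode step, before accounting for the slower runs observed at this size.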

This is intra-node, i.e. the lower-bound transfer cost. Inter-node RDMA is
expected to be slower; measuring it is MB2 phase 2.

Adds:
- analyze_mb2.py: pair A.send_blocks ↔ B.receive_kv by time window;
  per-size aggregation (n, ms_p50, ms_min/max, GB/s_p50/max)
- plot_mb2.py: log-log transfer-time chart + bandwidth-vs-size chart
- analysis/mb2/A_intra_kvboth.jsonl, B_intra_kvboth.jsonl: raw events
  (51 + 102 events including the sanity preamble)
- analysis/mb2/intra_kvboth_breakdown.json: paired and aggregated
- figs/mb2_transfer_time_intra.png, figs/mb2_transfer_bw_intra.png

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 19:04:03 +08:00
parent 91673f1fb8
commit de164e5a64
7 changed files with 1189 additions and 0 deletions

analysis/mb2/A_intra_kvboth.jsonl

@@ -0,0 +1,51 @@
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 50331648, "duration_s": 0.023202952987048775, "t_start_unix": 1779879143.174031, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879143.1972432}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 50331648, "duration_s": 0.005375694017857313, "t_start_unix": 1779879143.2982283, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879143.3036084}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 201326592, "duration_s": 0.021170366962905973, "t_start_unix": 1779879143.5159554, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879143.5371296}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 201326592, "duration_s": 0.020726953051052988, "t_start_unix": 1779879143.6974514, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879143.7181835}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 805306368, "duration_s": 0.08536655298667029, "t_start_unix": 1779879144.3294952, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879144.4148676}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 805306368, "duration_s": 0.08367906499188393, "t_start_unix": 1779879145.0419943, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879145.125678}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 1572864, "duration_s": 0.0004059679922647774, "t_start_unix": 1779879221.7078288, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879221.7082384}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 1572864, "duration_s": 0.000346789020113647, "t_start_unix": 1779879221.7838593, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879221.7842083}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 50331648, "duration_s": 0.005353622022084892, "t_start_unix": 1779879221.8607252, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879221.8660822}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 50331648, "duration_s": 0.005279594974126667, "t_start_unix": 1779879221.9432015, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879221.9484842}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 50331648, "duration_s": 0.0053006180096417665, "t_start_unix": 1779879222.0243337, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879222.0296378}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 100663296, "duration_s": 0.010396577999927104, "t_start_unix": 1779879222.130936, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879222.141335}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 100663296, "duration_s": 0.010438029014039785, "t_start_unix": 1779879222.2442062, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879222.2546473}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 100663296, "duration_s": 0.010436972021125257, "t_start_unix": 1779879222.3581295, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879222.3685696}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 100663296, "duration_s": 0.010396371013484895, "t_start_unix": 1779879222.4725878, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879222.4829867}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 100663296, "duration_s": 0.010352785000577569, "t_start_unix": 1779879222.5837166, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879222.5940716}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 1572864, "duration_s": 0.00034007197245955467, "t_start_unix": 1779879222.7521152, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879222.752458}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 1572864, "duration_s": 0.00041691696969792247, "t_start_unix": 1779879222.9143836, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879222.914805}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 201326592, "duration_s": 0.020633380976505578, "t_start_unix": 1779879223.0778644, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879223.098502}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 201326592, "duration_s": 0.020639199996367097, "t_start_unix": 1779879223.2603853, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879223.2810278}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 201326592, "duration_s": 0.020575353992171586, "t_start_unix": 1779879223.4418828, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879223.4624615}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 402653184, "duration_s": 0.041439525957684964, "t_start_unix": 1779879223.7544343, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879223.7958782}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 402653184, "duration_s": 0.04152030003024265, "t_start_unix": 1779879224.0914912, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879224.133016}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 402653184, "duration_s": 0.04148670402355492, "t_start_unix": 1779879224.4262393, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879224.4677298}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 402653184, "duration_s": 0.04146742797456682, "t_start_unix": 1779879224.7617002, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879224.8031723}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 402653184, "duration_s": 0.04143296502297744, "t_start_unix": 1779879225.0978234, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879225.1392617}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 1572864, "duration_s": 0.0003991159610450268, "t_start_unix": 1779879225.760789, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879225.7611918}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 1572864, "duration_s": 0.00041423802031204104, "t_start_unix": 1779879226.3864496, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879226.3868673}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 805306368, "duration_s": 0.08309489500243217, "t_start_unix": 1779879227.0107942, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879227.0938945}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 805306368, "duration_s": 0.08372796402545646, "t_start_unix": 1779879227.7207224, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879227.8044555}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 805306368, "duration_s": 0.08398396399570629, "t_start_unix": 1779879228.4314566, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879228.5154452}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 1610612736, "duration_s": 0.16950496198842302, "t_start_unix": 1779879230.1334376, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879230.3029544}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 1610612736, "duration_s": 0.16713789198547602, "t_start_unix": 1779879231.8981037, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879232.0652575}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 1610612736, "duration_s": 0.16713115200400352, "t_start_unix": 1779879233.6608078, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879233.827945}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 1610612736, "duration_s": 0.16709016199456528, "t_start_unix": 1779879235.419875, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879235.5869706}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 1610612736, "duration_s": 0.166486973001156, "t_start_unix": 1779879237.1821773, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879237.34867}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 3221225472, "duration_s": 0.31926770601421595, "t_start_unix": 1779879241.9880297, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879242.3073065}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 3221225472, "duration_s": 0.3197040680097416, "t_start_unix": 1779879246.9779432, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879247.2976692}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 3221225472, "duration_s": 0.32088329299585894, "t_start_unix": 1779879251.9643052, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879252.285209}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 3221225472, "duration_s": 0.5439103110111319, "t_start_unix": 1779879256.9989722, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879257.5428913}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 3221225472, "duration_s": 0.5193864739849232, "t_start_unix": 1779879262.2562187, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879262.7756212}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 6442450944, "duration_s": 1.9844180009677075, "t_start_unix": 1779879278.199048, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879280.1834915}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 6442450944, "duration_s": 2.1099297259934247, "t_start_unix": 1779879295.6967168, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879297.8066647}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 6442450944, "duration_s": 1.8950715209939517, "t_start_unix": 1779879313.3236735, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879315.2187643}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 6442450944, "duration_s": 0.9277855920372531, "t_start_unix": 1779879330.6715357, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879331.599329}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 6442450944, "duration_s": 0.6652462020283565, "t_start_unix": 1779879346.9950044, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879347.6602724}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 12884901888, "duration_s": 1.3330365709844045, "t_start_unix": 1779879402.7169023, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879404.04997}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 12884901888, "duration_s": 5.839069904992357, "t_start_unix": 1779879459.0566247, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879464.8957155}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 12884901888, "duration_s": 9.862486142024864, "t_start_unix": 1779879519.9567635, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879529.8192694}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 12884901888, "duration_s": 2.8350498770014383, "t_start_unix": 1779879584.9780834, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879587.813154}
{"event": "send_blocks", "remote_session": "172.27.123.142:16428", "total_bytes": 12884901888, "duration_s": 1.485496642999351, "t_start_unix": 1779879642.639775, "ret": 0, "tp_rank": 0, "t_log_unix": 1779879644.1252885}

analysis/mb2/B_intra_kvboth.jsonl

@@ -0,0 +1,102 @@
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-ad00672f263a6643-0-9479211a"], "t_start_unix": 1779879143.1678784, "tp_rank": 0, "t_log_unix": 1779879143.167884}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-ad00672f263a6643-0-9479211a"], "duration_s": 0.03333390498301014, "t_start_unix": 1779879143.1678784, "tp_rank": 0, "t_log_unix": 1779879143.201217}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-ace77e2b02f9f141-0-b3c061bc"], "t_start_unix": 1779879143.2968972, "tp_rank": 0, "t_log_unix": 1779879143.2969005}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-ace77e2b02f9f141-0-b3c061bc"], "duration_s": 0.007019245007541031, "t_start_unix": 1779879143.2968972, "tp_rank": 0, "t_log_unix": 1779879143.3039184}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-a4a2366879c68ded-0-8ac4098e"], "t_start_unix": 1779879143.5146625, "tp_rank": 0, "t_log_unix": 1779879143.5146651}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-a4a2366879c68ded-0-8ac4098e"], "duration_s": 0.02278437599306926, "t_start_unix": 1779879143.5146625, "tp_rank": 0, "t_log_unix": 1779879143.537448}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-8690cafcace0d5e2-0-b89f33d2"], "t_start_unix": 1779879143.6958342, "tp_rank": 0, "t_log_unix": 1779879143.6958375}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-8690cafcace0d5e2-0-b89f33d2"], "duration_s": 0.022794076008722186, "t_start_unix": 1779879143.6958342, "tp_rank": 0, "t_log_unix": 1779879143.71863}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-b087e2ec4cfa8eb7-0-b908f425"], "t_start_unix": 1779879144.3279662, "tp_rank": 0, "t_log_unix": 1779879144.3279696}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-b087e2ec4cfa8eb7-0-b908f425"], "duration_s": 0.08753501297906041, "t_start_unix": 1779879144.3279662, "tp_rank": 0, "t_log_unix": 1779879144.415505}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-a115d16ff5575e08-0-9fa81984"], "t_start_unix": 1779879145.040141, "tp_rank": 0, "t_log_unix": 1779879145.0401456}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-a115d16ff5575e08-0-9fa81984"], "duration_s": 0.0860149699728936, "t_start_unix": 1779879145.040141, "tp_rank": 0, "t_log_unix": 1779879145.1261594}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-9e585ed083951df5-0-b03f812b"], "t_start_unix": 1779879221.7062025, "tp_rank": 0, "t_log_unix": 1779879221.7062056}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-9e585ed083951df5-0-b03f812b"], "duration_s": 0.002459956973325461, "t_start_unix": 1779879221.7062025, "tp_rank": 0, "t_log_unix": 1779879221.7086644}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-9271d403c044eadd-0-9c3c4639"], "t_start_unix": 1779879221.7826598, "tp_rank": 0, "t_log_unix": 1779879221.782662}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-9271d403c044eadd-0-9c3c4639"], "duration_s": 0.0020201010047458112, "t_start_unix": 1779879221.7826598, "tp_rank": 0, "t_log_unix": 1779879221.7846813}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-82a580cefd3e2440-0-a383c3c4"], "t_start_unix": 1779879221.859549, "tp_rank": 0, "t_log_unix": 1779879221.8595514}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-82a580cefd3e2440-0-a383c3c4"], "duration_s": 0.006836243963334709, "t_start_unix": 1779879221.859549, "tp_rank": 0, "t_log_unix": 1779879221.8663864}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-a31cb4bc9e7f63d2-0-8f48aacd"], "t_start_unix": 1779879221.9419758, "tp_rank": 0, "t_log_unix": 1779879221.9419782}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-a31cb4bc9e7f63d2-0-8f48aacd"], "duration_s": 0.00694335694424808, "t_start_unix": 1779879221.9419758, "tp_rank": 0, "t_log_unix": 1779879221.9489205}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-a9dfc1a5b425d994-0-a0930098"], "t_start_unix": 1779879222.0232244, "tp_rank": 0, "t_log_unix": 1779879222.0232272}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-a9dfc1a5b425d994-0-a0930098"], "duration_s": 0.006697195000015199, "t_start_unix": 1779879222.0232244, "tp_rank": 0, "t_log_unix": 1779879222.0299227}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-9712857755af2efc-0-90b2dc9b"], "t_start_unix": 1779879222.1297998, "tp_rank": 0, "t_log_unix": 1779879222.1298025}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-9712857755af2efc-0-90b2dc9b"], "duration_s": 0.01183948403922841, "t_start_unix": 1779879222.1297998, "tp_rank": 0, "t_log_unix": 1779879222.1416407}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-b4f0a10dee65acbe-0-a3c132fc"], "t_start_unix": 1779879222.243023, "tp_rank": 0, "t_log_unix": 1779879222.243025}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-b4f0a10dee65acbe-0-a3c132fc"], "duration_s": 0.01214482297655195, "t_start_unix": 1779879222.243023, "tp_rank": 0, "t_log_unix": 1779879222.2551687}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-b4c514b80b52a3f2-0-bcd24f8e"], "t_start_unix": 1779879222.3569698, "tp_rank": 0, "t_log_unix": 1779879222.356972}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-b4c514b80b52a3f2-0-bcd24f8e"], "duration_s": 0.011961110983975232, "t_start_unix": 1779879222.3569698, "tp_rank": 0, "t_log_unix": 1779879222.368932}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-ac7118d8090d181c-0-8af4adf0"], "t_start_unix": 1779879222.4715128, "tp_rank": 0, "t_log_unix": 1779879222.4715152}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-ac7118d8090d181c-0-8af4adf0"], "duration_s": 0.011788576026447117, "t_start_unix": 1779879222.4715128, "tp_rank": 0, "t_log_unix": 1779879222.4833028}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-85291bcb93aaf638-0-868db1a8"], "t_start_unix": 1779879222.5826046, "tp_rank": 0, "t_log_unix": 1779879222.5826073}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-85291bcb93aaf638-0-868db1a8"], "duration_s": 0.0118055299972184, "t_start_unix": 1779879222.5826046, "tp_rank": 0, "t_log_unix": 1779879222.5944116}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-a448cf2e059ba0c9-0-a1360796"], "t_start_unix": 1779879222.750828, "tp_rank": 0, "t_log_unix": 1779879222.7508304}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-a448cf2e059ba0c9-0-a1360796"], "duration_s": 0.0021119200391694903, "t_start_unix": 1779879222.750828, "tp_rank": 0, "t_log_unix": 1779879222.7529414}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-b486fd9e945a4658-0-8bb561cd"], "t_start_unix": 1779879222.913044, "tp_rank": 0, "t_log_unix": 1779879222.9130466}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-b486fd9e945a4658-0-8bb561cd"], "duration_s": 0.0022232600022107363, "t_start_unix": 1779879222.913044, "tp_rank": 0, "t_log_unix": 1779879222.9152684}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-82da2bfe65f276c6-0-88d9a9a2"], "t_start_unix": 1779879223.0765986, "tp_rank": 0, "t_log_unix": 1779879223.0766027}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-82da2bfe65f276c6-0-88d9a9a2"], "duration_s": 0.022250515001360327, "t_start_unix": 1779879223.0765986, "tp_rank": 0, "t_log_unix": 1779879223.0988505}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-93bd777652eba5f3-0-9ec3d058"], "t_start_unix": 1779879223.2591784, "tp_rank": 0, "t_log_unix": 1779879223.2591808}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-93bd777652eba5f3-0-9ec3d058"], "duration_s": 0.022157608007546514, "t_start_unix": 1779879223.2591784, "tp_rank": 0, "t_log_unix": 1779879223.2813375}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-81f950480a3cabf9-0-bbf8584f"], "t_start_unix": 1779879223.4402068, "tp_rank": 0, "t_log_unix": 1779879223.440209}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-81f950480a3cabf9-0-bbf8584f"], "duration_s": 0.022589912987314165, "t_start_unix": 1779879223.4402068, "tp_rank": 0, "t_log_unix": 1779879223.4627984}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-b109ed06b5882659-0-8d14993c"], "t_start_unix": 1779879223.7529812, "tp_rank": 0, "t_log_unix": 1779879223.752984}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-b109ed06b5882659-0-8d14993c"], "duration_s": 0.043345845013391227, "t_start_unix": 1779879223.7529812, "tp_rank": 0, "t_log_unix": 1779879223.796329}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-8a57776c81d64b2c-0-ace8fb2b"], "t_start_unix": 1779879224.0899644, "tp_rank": 0, "t_log_unix": 1779879224.0899673}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-8a57776c81d64b2c-0-ace8fb2b"], "duration_s": 0.04341953102266416, "t_start_unix": 1779879224.0899644, "tp_rank": 0, "t_log_unix": 1779879224.1333857}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-9b1a5dce18758450-0-b17b3649"], "t_start_unix": 1779879224.424807, "tp_rank": 0, "t_log_unix": 1779879224.42481}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-9b1a5dce18758450-0-b17b3649"], "duration_s": 0.04336977802449837, "t_start_unix": 1779879224.424807, "tp_rank": 0, "t_log_unix": 1779879224.4681823}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-8c7d412b85f43ed7-0-9dea4add"], "t_start_unix": 1779879224.7599711, "tp_rank": 0, "t_log_unix": 1779879224.7599735}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-8c7d412b85f43ed7-0-9dea4add"], "duration_s": 0.043769759009592235, "t_start_unix": 1779879224.7599711, "tp_rank": 0, "t_log_unix": 1779879224.8037443}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-8860308db3f010a5-0-ad51eb46"], "t_start_unix": 1779879225.0962389, "tp_rank": 0, "t_log_unix": 1779879225.0962446}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-8860308db3f010a5-0-ad51eb46"], "duration_s": 0.043612666020635515, "t_start_unix": 1779879225.0962389, "tp_rank": 0, "t_log_unix": 1779879225.1398532}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-86cca1a2b9427801-0-ba41ade7"], "t_start_unix": 1779879225.7592747, "tp_rank": 0, "t_log_unix": 1779879225.759278}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-86cca1a2b9427801-0-ba41ade7"], "duration_s": 0.002386144013144076, "t_start_unix": 1779879225.7592747, "tp_rank": 0, "t_log_unix": 1779879225.7616625}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-a208c6d804293be7-0-94d265ab"], "t_start_unix": 1779879226.384918, "tp_rank": 0, "t_log_unix": 1779879226.384921}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-a208c6d804293be7-0-94d265ab"], "duration_s": 0.0023903060355223715, "t_start_unix": 1779879226.384918, "tp_rank": 0, "t_log_unix": 1779879226.3873098}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-b53bea2317cc1211-0-8fcad8a8"], "t_start_unix": 1779879227.0092332, "tp_rank": 0, "t_log_unix": 1779879227.0092363}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-b53bea2317cc1211-0-8fcad8a8"], "duration_s": 0.08524628396844491, "t_start_unix": 1779879227.0092332, "tp_rank": 0, "t_log_unix": 1779879227.094482}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-9daf909593bbdf03-0-8fd7d50e"], "t_start_unix": 1779879227.7190688, "tp_rank": 0, "t_log_unix": 1779879227.719072}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-9daf909593bbdf03-0-8fd7d50e"], "duration_s": 0.08596085698809475, "t_start_unix": 1779879227.7190688, "tp_rank": 0, "t_log_unix": 1779879227.8050315}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-9ef40f3b6d736128-0-8e8e1c30"], "t_start_unix": 1779879228.4297745, "tp_rank": 0, "t_log_unix": 1779879228.4297774}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-9ef40f3b6d736128-0-8e8e1c30"], "duration_s": 0.0860762019874528, "t_start_unix": 1779879228.4297745, "tp_rank": 0, "t_log_unix": 1779879228.5158527}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-851e5d7e3e83d7ea-0-a66a5e0b"], "t_start_unix": 1779879230.131392, "tp_rank": 0, "t_log_unix": 1779879230.1313956}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-851e5d7e3e83d7ea-0-a66a5e0b"], "duration_s": 0.1721468890318647, "t_start_unix": 1779879230.131392, "tp_rank": 0, "t_log_unix": 1779879230.3035412}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-9be12af6a9ccccf5-0-af1230c7"], "t_start_unix": 1779879231.896075, "tp_rank": 0, "t_log_unix": 1779879231.896078}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-9be12af6a9ccccf5-0-af1230c7"], "duration_s": 0.16974544001277536, "t_start_unix": 1779879231.896075, "tp_rank": 0, "t_log_unix": 1779879232.0658224}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-b61b9b237366297b-0-9832f0e3"], "t_start_unix": 1779879233.6589305, "tp_rank": 0, "t_log_unix": 1779879233.6589334}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-b61b9b237366297b-0-9832f0e3"], "duration_s": 0.16975757898762822, "t_start_unix": 1779879233.6589305, "tp_rank": 0, "t_log_unix": 1779879233.8286898}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-bae0d0efe47ece8f-0-affbc685"], "t_start_unix": 1779879235.4181106, "tp_rank": 0, "t_log_unix": 1779879235.418114}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-bae0d0efe47ece8f-0-affbc685"], "duration_s": 0.1695251659839414, "t_start_unix": 1779879235.4181106, "tp_rank": 0, "t_log_unix": 1779879235.587638}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-a34bc73c9cd2efc1-0-90d647fc"], "t_start_unix": 1779879237.1803744, "tp_rank": 0, "t_log_unix": 1779879237.1803775}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-a34bc73c9cd2efc1-0-90d647fc"], "duration_s": 0.16962904302636161, "t_start_unix": 1779879237.1803744, "tp_rank": 0, "t_log_unix": 1779879237.3500054}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-89a36c12ee6b0ff3-0-9fddbc0f"], "t_start_unix": 1779879241.9859307, "tp_rank": 0, "t_log_unix": 1779879241.9859338}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-89a36c12ee6b0ff3-0-9fddbc0f"], "duration_s": 0.32203804596792907, "t_start_unix": 1779879241.9859307, "tp_rank": 0, "t_log_unix": 1779879242.3079708}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-8d65512eb7e3c36c-0-8b23597c"], "t_start_unix": 1779879246.9755645, "tp_rank": 0, "t_log_unix": 1779879246.9755676}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-8d65512eb7e3c36c-0-8b23597c"], "duration_s": 0.3227974839974195, "t_start_unix": 1779879246.9755645, "tp_rank": 0, "t_log_unix": 1779879247.2983644}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-a13c271ecbbca78b-0-b76a0370"], "t_start_unix": 1779879251.9618897, "tp_rank": 0, "t_log_unix": 1779879251.961893}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-a13c271ecbbca78b-0-b76a0370"], "duration_s": 0.3240378479822539, "t_start_unix": 1779879251.9618897, "tp_rank": 0, "t_log_unix": 1779879252.2859304}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-bada04ec8c556aca-0-a263d637"], "t_start_unix": 1779879256.9512377, "tp_rank": 0, "t_log_unix": 1779879256.9512408}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-bada04ec8c556aca-0-a263d637"], "duration_s": 0.5924434679909609, "t_start_unix": 1779879256.9512377, "tp_rank": 0, "t_log_unix": 1779879257.5436878}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-9641a077022e6123-0-8c3c0975"], "t_start_unix": 1779879262.2127163, "tp_rank": 0, "t_log_unix": 1779879262.2127194}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-9641a077022e6123-0-8c3c0975"], "duration_s": 0.5644763479940593, "t_start_unix": 1779879262.2127163, "tp_rank": 0, "t_log_unix": 1779879262.777195}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-bb3a4e5084af8c3a-0-bdfa0931"], "t_start_unix": 1779879278.1063075, "tp_rank": 0, "t_log_unix": 1779879278.106311}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-bb3a4e5084af8c3a-0-bdfa0931"], "duration_s": 2.0784930550144054, "t_start_unix": 1779879278.1063075, "tp_rank": 0, "t_log_unix": 1779879280.1848085}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-91b951f85c93a71b-0-8396bee5"], "t_start_unix": 1779879295.600993, "tp_rank": 0, "t_log_unix": 1779879295.6009963}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-91b951f85c93a71b-0-8396bee5"], "duration_s": 2.2067435560165904, "t_start_unix": 1779879295.600993, "tp_rank": 0, "t_log_unix": 1779879297.8077443}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-81d236ecb6aadadf-0-ac184d51"], "t_start_unix": 1779879313.2315958, "tp_rank": 0, "t_log_unix": 1779879313.2315989}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-81d236ecb6aadadf-0-ac184d51"], "duration_s": 1.9879729640088044, "t_start_unix": 1779879313.2315958, "tp_rank": 0, "t_log_unix": 1779879315.219571}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-a4c76c62b44c4295-0-b007a6ed"], "t_start_unix": 1779879330.6154163, "tp_rank": 0, "t_log_unix": 1779879330.6154196}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-a4c76c62b44c4295-0-b007a6ed"], "duration_s": 0.9849357060156763, "t_start_unix": 1779879330.6154163, "tp_rank": 0, "t_log_unix": 1779879331.6003594}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-a06d4b774a8af9a5-0-980e9d23"], "t_start_unix": 1779879346.990221, "tp_rank": 0, "t_log_unix": 1779879346.9902246}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-a06d4b774a8af9a5-0-980e9d23"], "duration_s": 0.6725030990201049, "t_start_unix": 1779879346.990221, "tp_rank": 0, "t_log_unix": 1779879347.6627269}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-bf0d435e06e3349f-0-8507c933"], "t_start_unix": 1779879402.7123013, "tp_rank": 0, "t_log_unix": 1779879402.7123044}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-bf0d435e06e3349f-0-8507c933"], "duration_s": 1.3384539679973386, "t_start_unix": 1779879402.7123013, "tp_rank": 0, "t_log_unix": 1779879404.0507588}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-9f87ae0fb0c7eec8-0-a8a1daea"], "t_start_unix": 1779879458.9232886, "tp_rank": 0, "t_log_unix": 1779879458.9232917}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-9f87ae0fb0c7eec8-0-a8a1daea"], "duration_s": 5.973284716019407, "t_start_unix": 1779879458.9232886, "tp_rank": 0, "t_log_unix": 1779879464.896582}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-a62e48e40e6c6ad7-0-acca9741"], "t_start_unix": 1779879519.7647448, "tp_rank": 0, "t_log_unix": 1779879519.7647479}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-a62e48e40e6c6ad7-0-acca9741"], "duration_s": 10.056511385017075, "t_start_unix": 1779879519.7647448, "tp_rank": 0, "t_log_unix": 1779879529.8212643}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-824479d53bab40e4-0-af951a11"], "t_start_unix": 1779879584.888362, "tp_rank": 0, "t_log_unix": 1779879584.8883653}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-824479d53bab40e4-0-af951a11"], "duration_s": 2.925714804965537, "t_start_unix": 1779879584.888362, "tp_rank": 0, "t_log_unix": 1779879587.814085}
{"event": "receive_kv_enter", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-9f06f19c981c0b3f-0-b3afb370"], "t_start_unix": 1779879642.6076336, "tp_rank": 0, "t_log_unix": 1779879642.6076367}
{"event": "receive_kv_finish", "worker_addr": "tcp://172.27.123.142:44435", "req_ids": ["cmpl-9f06f19c981c0b3f-0-b3afb370"], "duration_s": 1.5183607729850337, "t_start_unix": 1779879642.6076336, "tp_rank": 0, "t_log_unix": 1779879644.1259985}


@@ -0,0 +1,758 @@
{
"rows": [
{
"input_tokens_est": 512,
"total_bytes": 50331648,
"pure_transfer_s": 0.023202952987048775,
"rx_total_s": 0.03333390498301014,
"rx_overhead_s": 0.010130951995961368,
"rx_t_start_unix": 1779879143.1678784,
"send_t_start_unix": 1779879143.174031,
"req_ids": [
"cmpl-ad00672f263a6643-0-9479211a"
]
},
{
"input_tokens_est": 512,
"total_bytes": 50331648,
"pure_transfer_s": 0.005375694017857313,
"rx_total_s": 0.007019245007541031,
"rx_overhead_s": 0.0016435509896837175,
"rx_t_start_unix": 1779879143.2968972,
"send_t_start_unix": 1779879143.2982283,
"req_ids": [
"cmpl-ace77e2b02f9f141-0-b3c061bc"
]
},
{
"input_tokens_est": 2048,
"total_bytes": 201326592,
"pure_transfer_s": 0.021170366962905973,
"rx_total_s": 0.02278437599306926,
"rx_overhead_s": 0.0016140090301632881,
"rx_t_start_unix": 1779879143.5146625,
"send_t_start_unix": 1779879143.5159554,
"req_ids": [
"cmpl-a4a2366879c68ded-0-8ac4098e"
]
},
{
"input_tokens_est": 2048,
"total_bytes": 201326592,
"pure_transfer_s": 0.020726953051052988,
"rx_total_s": 0.022794076008722186,
"rx_overhead_s": 0.0020671229576691985,
"rx_t_start_unix": 1779879143.6958342,
"send_t_start_unix": 1779879143.6974514,
"req_ids": [
"cmpl-8690cafcace0d5e2-0-b89f33d2"
]
},
{
"input_tokens_est": 8192,
"total_bytes": 805306368,
"pure_transfer_s": 0.08536655298667029,
"rx_total_s": 0.08753501297906041,
"rx_overhead_s": 0.002168459992390126,
"rx_t_start_unix": 1779879144.3279662,
"send_t_start_unix": 1779879144.3294952,
"req_ids": [
"cmpl-b087e2ec4cfa8eb7-0-b908f425"
]
},
{
"input_tokens_est": 8192,
"total_bytes": 805306368,
"pure_transfer_s": 0.08367906499188393,
"rx_total_s": 0.0860149699728936,
"rx_overhead_s": 0.002335904981009662,
"rx_t_start_unix": 1779879145.040141,
"send_t_start_unix": 1779879145.0419943,
"req_ids": [
"cmpl-a115d16ff5575e08-0-9fa81984"
]
},
{
"input_tokens_est": 16,
"total_bytes": 1572864,
"pure_transfer_s": 0.0004059679922647774,
"rx_total_s": 0.002459956973325461,
"rx_overhead_s": 0.0020539889810606837,
"rx_t_start_unix": 1779879221.7062025,
"send_t_start_unix": 1779879221.7078288,
"req_ids": [
"cmpl-9e585ed083951df5-0-b03f812b"
]
},
{
"input_tokens_est": 16,
"total_bytes": 1572864,
"pure_transfer_s": 0.000346789020113647,
"rx_total_s": 0.0020201010047458112,
"rx_overhead_s": 0.0016733119846321642,
"rx_t_start_unix": 1779879221.7826598,
"send_t_start_unix": 1779879221.7838593,
"req_ids": [
"cmpl-9271d403c044eadd-0-9c3c4639"
]
},
{
"input_tokens_est": 512,
"total_bytes": 50331648,
"pure_transfer_s": 0.005353622022084892,
"rx_total_s": 0.006836243963334709,
"rx_overhead_s": 0.0014826219412498176,
"rx_t_start_unix": 1779879221.859549,
"send_t_start_unix": 1779879221.8607252,
"req_ids": [
"cmpl-82a580cefd3e2440-0-a383c3c4"
]
},
{
"input_tokens_est": 512,
"total_bytes": 50331648,
"pure_transfer_s": 0.005279594974126667,
"rx_total_s": 0.00694335694424808,
"rx_overhead_s": 0.0016637619701214135,
"rx_t_start_unix": 1779879221.9419758,
"send_t_start_unix": 1779879221.9432015,
"req_ids": [
"cmpl-a31cb4bc9e7f63d2-0-8f48aacd"
]
},
{
"input_tokens_est": 512,
"total_bytes": 50331648,
"pure_transfer_s": 0.0053006180096417665,
"rx_total_s": 0.006697195000015199,
"rx_overhead_s": 0.0013965769903734326,
"rx_t_start_unix": 1779879222.0232244,
"send_t_start_unix": 1779879222.0243337,
"req_ids": [
"cmpl-a9dfc1a5b425d994-0-a0930098"
]
},
{
"input_tokens_est": 1024,
"total_bytes": 100663296,
"pure_transfer_s": 0.010396577999927104,
"rx_total_s": 0.01183948403922841,
"rx_overhead_s": 0.001442906039301306,
"rx_t_start_unix": 1779879222.1297998,
"send_t_start_unix": 1779879222.130936,
"req_ids": [
"cmpl-9712857755af2efc-0-90b2dc9b"
]
},
{
"input_tokens_est": 1024,
"total_bytes": 100663296,
"pure_transfer_s": 0.010438029014039785,
"rx_total_s": 0.01214482297655195,
"rx_overhead_s": 0.0017067939625121653,
"rx_t_start_unix": 1779879222.243023,
"send_t_start_unix": 1779879222.2442062,
"req_ids": [
"cmpl-b4f0a10dee65acbe-0-a3c132fc"
]
},
{
"input_tokens_est": 1024,
"total_bytes": 100663296,
"pure_transfer_s": 0.010436972021125257,
"rx_total_s": 0.011961110983975232,
"rx_overhead_s": 0.0015241389628499746,
"rx_t_start_unix": 1779879222.3569698,
"send_t_start_unix": 1779879222.3581295,
"req_ids": [
"cmpl-b4c514b80b52a3f2-0-bcd24f8e"
]
},
{
"input_tokens_est": 1024,
"total_bytes": 100663296,
"pure_transfer_s": 0.010396371013484895,
"rx_total_s": 0.011788576026447117,
"rx_overhead_s": 0.001392205012962222,
"rx_t_start_unix": 1779879222.4715128,
"send_t_start_unix": 1779879222.4725878,
"req_ids": [
"cmpl-ac7118d8090d181c-0-8af4adf0"
]
},
{
"input_tokens_est": 1024,
"total_bytes": 100663296,
"pure_transfer_s": 0.010352785000577569,
"rx_total_s": 0.0118055299972184,
"rx_overhead_s": 0.0014527449966408312,
"rx_t_start_unix": 1779879222.5826046,
"send_t_start_unix": 1779879222.5837166,
"req_ids": [
"cmpl-85291bcb93aaf638-0-868db1a8"
]
},
{
"input_tokens_est": 16,
"total_bytes": 1572864,
"pure_transfer_s": 0.00034007197245955467,
"rx_total_s": 0.0021119200391694903,
"rx_overhead_s": 0.0017718480667099357,
"rx_t_start_unix": 1779879222.750828,
"send_t_start_unix": 1779879222.7521152,
"req_ids": [
"cmpl-a448cf2e059ba0c9-0-a1360796"
]
},
{
"input_tokens_est": 16,
"total_bytes": 1572864,
"pure_transfer_s": 0.00041691696969792247,
"rx_total_s": 0.0022232600022107363,
"rx_overhead_s": 0.0018063430325128138,
"rx_t_start_unix": 1779879222.913044,
"send_t_start_unix": 1779879222.9143836,
"req_ids": [
"cmpl-b486fd9e945a4658-0-8bb561cd"
]
},
{
"input_tokens_est": 2048,
"total_bytes": 201326592,
"pure_transfer_s": 0.020633380976505578,
"rx_total_s": 0.022250515001360327,
"rx_overhead_s": 0.0016171340248547494,
"rx_t_start_unix": 1779879223.0765986,
"send_t_start_unix": 1779879223.0778644,
"req_ids": [
"cmpl-82da2bfe65f276c6-0-88d9a9a2"
]
},
{
"input_tokens_est": 2048,
"total_bytes": 201326592,
"pure_transfer_s": 0.020639199996367097,
"rx_total_s": 0.022157608007546514,
"rx_overhead_s": 0.0015184080111794174,
"rx_t_start_unix": 1779879223.2591784,
"send_t_start_unix": 1779879223.2603853,
"req_ids": [
"cmpl-93bd777652eba5f3-0-9ec3d058"
]
},
{
"input_tokens_est": 2048,
"total_bytes": 201326592,
"pure_transfer_s": 0.020575353992171586,
"rx_total_s": 0.022589912987314165,
"rx_overhead_s": 0.002014558995142579,
"rx_t_start_unix": 1779879223.4402068,
"send_t_start_unix": 1779879223.4418828,
"req_ids": [
"cmpl-81f950480a3cabf9-0-bbf8584f"
]
},
{
"input_tokens_est": 4096,
"total_bytes": 402653184,
"pure_transfer_s": 0.041439525957684964,
"rx_total_s": 0.043345845013391227,
"rx_overhead_s": 0.0019063190557062626,
"rx_t_start_unix": 1779879223.7529812,
"send_t_start_unix": 1779879223.7544343,
"req_ids": [
"cmpl-b109ed06b5882659-0-8d14993c"
]
},
{
"input_tokens_est": 4096,
"total_bytes": 402653184,
"pure_transfer_s": 0.04152030003024265,
"rx_total_s": 0.04341953102266416,
"rx_overhead_s": 0.0018992309924215078,
"rx_t_start_unix": 1779879224.0899644,
"send_t_start_unix": 1779879224.0914912,
"req_ids": [
"cmpl-8a57776c81d64b2c-0-ace8fb2b"
]
},
{
"input_tokens_est": 4096,
"total_bytes": 402653184,
"pure_transfer_s": 0.04148670402355492,
"rx_total_s": 0.04336977802449837,
"rx_overhead_s": 0.0018830740009434521,
"rx_t_start_unix": 1779879224.424807,
"send_t_start_unix": 1779879224.4262393,
"req_ids": [
"cmpl-9b1a5dce18758450-0-b17b3649"
]
},
{
"input_tokens_est": 4096,
"total_bytes": 402653184,
"pure_transfer_s": 0.04146742797456682,
"rx_total_s": 0.043769759009592235,
"rx_overhead_s": 0.002302331035025418,
"rx_t_start_unix": 1779879224.7599711,
"send_t_start_unix": 1779879224.7617002,
"req_ids": [
"cmpl-8c7d412b85f43ed7-0-9dea4add"
]
},
{
"input_tokens_est": 4096,
"total_bytes": 402653184,
"pure_transfer_s": 0.04143296502297744,
"rx_total_s": 0.043612666020635515,
"rx_overhead_s": 0.002179700997658074,
"rx_t_start_unix": 1779879225.0962389,
"send_t_start_unix": 1779879225.0978234,
"req_ids": [
"cmpl-8860308db3f010a5-0-ad51eb46"
]
},
{
"input_tokens_est": 16,
"total_bytes": 1572864,
"pure_transfer_s": 0.0003991159610450268,
"rx_total_s": 0.002386144013144076,
"rx_overhead_s": 0.001987028052099049,
"rx_t_start_unix": 1779879225.7592747,
"send_t_start_unix": 1779879225.760789,
"req_ids": [
"cmpl-86cca1a2b9427801-0-ba41ade7"
]
},
{
"input_tokens_est": 16,
"total_bytes": 1572864,
"pure_transfer_s": 0.00041423802031204104,
"rx_total_s": 0.0023903060355223715,
"rx_overhead_s": 0.0019760680152103305,
"rx_t_start_unix": 1779879226.384918,
"send_t_start_unix": 1779879226.3864496,
"req_ids": [
"cmpl-a208c6d804293be7-0-94d265ab"
]
},
{
"input_tokens_est": 8192,
"total_bytes": 805306368,
"pure_transfer_s": 0.08309489500243217,
"rx_total_s": 0.08524628396844491,
"rx_overhead_s": 0.002151388966012746,
"rx_t_start_unix": 1779879227.0092332,
"send_t_start_unix": 1779879227.0107942,
"req_ids": [
"cmpl-b53bea2317cc1211-0-8fcad8a8"
]
},
{
"input_tokens_est": 8192,
"total_bytes": 805306368,
"pure_transfer_s": 0.08372796402545646,
"rx_total_s": 0.08596085698809475,
"rx_overhead_s": 0.0022328929626382887,
"rx_t_start_unix": 1779879227.7190688,
"send_t_start_unix": 1779879227.7207224,
"req_ids": [
"cmpl-9daf909593bbdf03-0-8fd7d50e"
]
},
{
"input_tokens_est": 8192,
"total_bytes": 805306368,
"pure_transfer_s": 0.08398396399570629,
"rx_total_s": 0.0860762019874528,
"rx_overhead_s": 0.002092237991746515,
"rx_t_start_unix": 1779879228.4297745,
"send_t_start_unix": 1779879228.4314566,
"req_ids": [
"cmpl-9ef40f3b6d736128-0-8e8e1c30"
]
},
{
"input_tokens_est": 16384,
"total_bytes": 1610612736,
"pure_transfer_s": 0.16950496198842302,
"rx_total_s": 0.1721468890318647,
"rx_overhead_s": 0.002641927043441683,
"rx_t_start_unix": 1779879230.131392,
"send_t_start_unix": 1779879230.1334376,
"req_ids": [
"cmpl-851e5d7e3e83d7ea-0-a66a5e0b"
]
},
{
"input_tokens_est": 16384,
"total_bytes": 1610612736,
"pure_transfer_s": 0.16713789198547602,
"rx_total_s": 0.16974544001277536,
"rx_overhead_s": 0.0026075480272993445,
"rx_t_start_unix": 1779879231.896075,
"send_t_start_unix": 1779879231.8981037,
"req_ids": [
"cmpl-9be12af6a9ccccf5-0-af1230c7"
]
},
{
"input_tokens_est": 16384,
"total_bytes": 1610612736,
"pure_transfer_s": 0.16713115200400352,
"rx_total_s": 0.16975757898762822,
"rx_overhead_s": 0.0026264269836246967,
"rx_t_start_unix": 1779879233.6589305,
"send_t_start_unix": 1779879233.6608078,
"req_ids": [
"cmpl-b61b9b237366297b-0-9832f0e3"
]
},
{
"input_tokens_est": 16384,
"total_bytes": 1610612736,
"pure_transfer_s": 0.16709016199456528,
"rx_total_s": 0.1695251659839414,
"rx_overhead_s": 0.0024350039893761277,
"rx_t_start_unix": 1779879235.4181106,
"send_t_start_unix": 1779879235.419875,
"req_ids": [
"cmpl-bae0d0efe47ece8f-0-affbc685"
]
},
{
"input_tokens_est": 16384,
"total_bytes": 1610612736,
"pure_transfer_s": 0.166486973001156,
"rx_total_s": 0.16962904302636161,
"rx_overhead_s": 0.003142070025205612,
"rx_t_start_unix": 1779879237.1803744,
"send_t_start_unix": 1779879237.1821773,
"req_ids": [
"cmpl-a34bc73c9cd2efc1-0-90d647fc"
]
},
{
"input_tokens_est": 32768,
"total_bytes": 3221225472,
"pure_transfer_s": 0.31926770601421595,
"rx_total_s": 0.32203804596792907,
"rx_overhead_s": 0.002770339953713119,
"rx_t_start_unix": 1779879241.9859307,
"send_t_start_unix": 1779879241.9880297,
"req_ids": [
"cmpl-89a36c12ee6b0ff3-0-9fddbc0f"
]
},
{
"input_tokens_est": 32768,
"total_bytes": 3221225472,
"pure_transfer_s": 0.3197040680097416,
"rx_total_s": 0.3227974839974195,
"rx_overhead_s": 0.003093415987677872,
"rx_t_start_unix": 1779879246.9755645,
"send_t_start_unix": 1779879246.9779432,
"req_ids": [
"cmpl-8d65512eb7e3c36c-0-8b23597c"
]
},
{
"input_tokens_est": 32768,
"total_bytes": 3221225472,
"pure_transfer_s": 0.32088329299585894,
"rx_total_s": 0.3240378479822539,
"rx_overhead_s": 0.003154554986394942,
"rx_t_start_unix": 1779879251.9618897,
"send_t_start_unix": 1779879251.9643052,
"req_ids": [
"cmpl-a13c271ecbbca78b-0-b76a0370"
]
},
{
"input_tokens_est": 32768,
"total_bytes": 3221225472,
"pure_transfer_s": 0.5439103110111319,
"rx_total_s": 0.5924434679909609,
"rx_overhead_s": 0.04853315697982907,
"rx_t_start_unix": 1779879256.9512377,
"send_t_start_unix": 1779879256.9989722,
"req_ids": [
"cmpl-bada04ec8c556aca-0-a263d637"
]
},
{
"input_tokens_est": 32768,
"total_bytes": 3221225472,
"pure_transfer_s": 0.5193864739849232,
"rx_total_s": 0.5644763479940593,
"rx_overhead_s": 0.04508987400913611,
"rx_t_start_unix": 1779879262.2127163,
"send_t_start_unix": 1779879262.2562187,
"req_ids": [
"cmpl-9641a077022e6123-0-8c3c0975"
]
},
{
"input_tokens_est": 65536,
"total_bytes": 6442450944,
"pure_transfer_s": 1.9844180009677075,
"rx_total_s": 2.0784930550144054,
"rx_overhead_s": 0.09407505404669791,
"rx_t_start_unix": 1779879278.1063075,
"send_t_start_unix": 1779879278.199048,
"req_ids": [
"cmpl-bb3a4e5084af8c3a-0-bdfa0931"
]
},
{
"input_tokens_est": 65536,
"total_bytes": 6442450944,
"pure_transfer_s": 2.1099297259934247,
"rx_total_s": 2.2067435560165904,
"rx_overhead_s": 0.09681383002316579,
"rx_t_start_unix": 1779879295.600993,
"send_t_start_unix": 1779879295.6967168,
"req_ids": [
"cmpl-91b951f85c93a71b-0-8396bee5"
]
},
{
"input_tokens_est": 65536,
"total_bytes": 6442450944,
"pure_transfer_s": 1.8950715209939517,
"rx_total_s": 1.9879729640088044,
"rx_overhead_s": 0.0929014430148527,
"rx_t_start_unix": 1779879313.2315958,
"send_t_start_unix": 1779879313.3236735,
"req_ids": [
"cmpl-81d236ecb6aadadf-0-ac184d51"
]
},
{
"input_tokens_est": 65536,
"total_bytes": 6442450944,
"pure_transfer_s": 0.9277855920372531,
"rx_total_s": 0.9849357060156763,
"rx_overhead_s": 0.05715011397842318,
"rx_t_start_unix": 1779879330.6154163,
"send_t_start_unix": 1779879330.6715357,
"req_ids": [
"cmpl-a4c76c62b44c4295-0-b007a6ed"
]
},
{
"input_tokens_est": 65536,
"total_bytes": 6442450944,
"pure_transfer_s": 0.6652462020283565,
"rx_total_s": 0.6725030990201049,
"rx_overhead_s": 0.007256896991748363,
"rx_t_start_unix": 1779879346.990221,
"send_t_start_unix": 1779879346.9950044,
"req_ids": [
"cmpl-a06d4b774a8af9a5-0-980e9d23"
]
},
{
"input_tokens_est": 131072,
"total_bytes": 12884901888,
"pure_transfer_s": 1.3330365709844045,
"rx_total_s": 1.3384539679973386,
"rx_overhead_s": 0.005417397012934089,
"rx_t_start_unix": 1779879402.7123013,
"send_t_start_unix": 1779879402.7169023,
"req_ids": [
"cmpl-bf0d435e06e3349f-0-8507c933"
]
},
{
"input_tokens_est": 131072,
"total_bytes": 12884901888,
"pure_transfer_s": 5.839069904992357,
"rx_total_s": 5.973284716019407,
"rx_overhead_s": 0.13421481102705002,
"rx_t_start_unix": 1779879458.9232886,
"send_t_start_unix": 1779879459.0566247,
"req_ids": [
"cmpl-9f87ae0fb0c7eec8-0-a8a1daea"
]
},
{
"input_tokens_est": 131072,
"total_bytes": 12884901888,
"pure_transfer_s": 9.862486142024864,
"rx_total_s": 10.056511385017075,
"rx_overhead_s": 0.19402524299221113,
"rx_t_start_unix": 1779879519.7647448,
"send_t_start_unix": 1779879519.9567635,
"req_ids": [
"cmpl-a62e48e40e6c6ad7-0-acca9741"
]
},
{
"input_tokens_est": 131072,
"total_bytes": 12884901888,
"pure_transfer_s": 2.8350498770014383,
"rx_total_s": 2.925714804965537,
"rx_overhead_s": 0.09066492796409875,
"rx_t_start_unix": 1779879584.888362,
"send_t_start_unix": 1779879584.9780834,
"req_ids": [
"cmpl-824479d53bab40e4-0-af951a11"
]
},
{
"input_tokens_est": 131072,
"total_bytes": 12884901888,
"pure_transfer_s": 1.485496642999351,
"rx_total_s": 1.5183607729850337,
"rx_overhead_s": 0.032864129985682666,
"rx_t_start_unix": 1779879642.6076336,
"send_t_start_unix": 1779879642.639775,
"req_ids": [
"cmpl-9f06f19c981c0b3f-0-b3afb370"
]
}
],
"summary": [
{
"input_tokens": 16,
"kv_mib": 1.5,
"n": 6,
"pure_transfer_ms_mean": 0.39,
"pure_transfer_ms_p50": 0.4,
"pure_transfer_ms_max": 0.42,
"pure_transfer_ms_min": 0.34,
"rx_total_ms_mean": 2.27,
"rx_overhead_ms_mean": 1.88,
"throughput_gbps_mean": 4.09,
"throughput_gbps_p50": 3.91,
"throughput_gbps_max": 4.63
},
{
"input_tokens": 512,
"kv_mib": 48.0,
"n": 5,
"pure_transfer_ms_mean": 8.9,
"pure_transfer_ms_p50": 5.35,
"pure_transfer_ms_max": 23.2,
"pure_transfer_ms_min": 5.28,
"rx_total_ms_mean": 12.17,
"rx_overhead_ms_mean": 3.26,
"throughput_gbps_mean": 7.99,
"throughput_gbps_p50": 9.4,
"throughput_gbps_max": 9.53
},
{
"input_tokens": 1024,
"kv_mib": 96.0,
"n": 5,
"pure_transfer_ms_mean": 10.4,
"pure_transfer_ms_p50": 10.4,
"pure_transfer_ms_max": 10.44,
"pure_transfer_ms_min": 10.35,
"rx_total_ms_mean": 11.91,
"rx_overhead_ms_mean": 1.5,
"throughput_gbps_mean": 9.68,
"throughput_gbps_p50": 9.68,
"throughput_gbps_max": 9.72
},
{
"input_tokens": 2048,
"kv_mib": 192.0,
"n": 5,
"pure_transfer_ms_mean": 20.75,
"pure_transfer_ms_p50": 20.64,
"pure_transfer_ms_max": 21.17,
"pure_transfer_ms_min": 20.58,
"rx_total_ms_mean": 22.52,
"rx_overhead_ms_mean": 1.77,
"throughput_gbps_mean": 9.7,
"throughput_gbps_p50": 9.75,
"throughput_gbps_max": 9.78
},
{
"input_tokens": 4096,
"kv_mib": 384.0,
"n": 5,
"pure_transfer_ms_mean": 41.47,
"pure_transfer_ms_p50": 41.47,
"pure_transfer_ms_max": 41.52,
"pure_transfer_ms_min": 41.43,
"rx_total_ms_mean": 43.5,
"rx_overhead_ms_mean": 2.03,
"throughput_gbps_mean": 9.71,
"throughput_gbps_p50": 9.71,
"throughput_gbps_max": 9.72
},
{
"input_tokens": 8192,
"kv_mib": 768.0,
"n": 5,
"pure_transfer_ms_mean": 83.97,
"pure_transfer_ms_p50": 83.73,
"pure_transfer_ms_max": 85.37,
"pure_transfer_ms_min": 83.09,
"rx_total_ms_mean": 86.17,
"rx_overhead_ms_mean": 2.2,
"throughput_gbps_mean": 9.59,
"throughput_gbps_p50": 9.62,
"throughput_gbps_max": 9.69
},
{
"input_tokens": 16384,
"kv_mib": 1536.0,
"n": 5,
"pure_transfer_ms_mean": 167.47,
"pure_transfer_ms_p50": 167.13,
"pure_transfer_ms_max": 169.5,
"pure_transfer_ms_min": 166.49,
"rx_total_ms_mean": 170.16,
"rx_overhead_ms_mean": 2.69,
"throughput_gbps_mean": 9.62,
"throughput_gbps_p50": 9.64,
"throughput_gbps_max": 9.67
},
{
"input_tokens": 32768,
"kv_mib": 3072.0,
"n": 5,
"pure_transfer_ms_mean": 404.63,
"pure_transfer_ms_p50": 320.88,
"pure_transfer_ms_max": 543.91,
"pure_transfer_ms_min": 319.27,
"rx_total_ms_mean": 425.16,
"rx_overhead_ms_mean": 20.53,
"throughput_gbps_mean": 8.47,
"throughput_gbps_p50": 10.04,
"throughput_gbps_max": 10.09
},
{
"input_tokens": 65536,
"kv_mib": 6144.0,
"n": 5,
"pure_transfer_ms_mean": 1516.49,
"pure_transfer_ms_p50": 1895.07,
"pure_transfer_ms_max": 2109.93,
"pure_transfer_ms_min": 665.25,
"rx_total_ms_mean": 1586.13,
"rx_overhead_ms_mean": 69.64,
"throughput_gbps_mean": 5.27,
"throughput_gbps_p50": 3.4,
"throughput_gbps_max": 9.68
},
{
"input_tokens": 131072,
"kv_mib": 12288.0,
"n": 5,
"pure_transfer_ms_mean": 4271.03,
"pure_transfer_ms_p50": 2835.05,
"pure_transfer_ms_max": 9862.49,
"pure_transfer_ms_min": 1333.04,
"rx_total_ms_mean": 4362.47,
"rx_overhead_ms_mean": 91.44,
"throughput_gbps_mean": 5.28,
"throughput_gbps_p50": 4.54,
"throughput_gbps_max": 9.67
}
]
}
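As a quick cross-check of the summary figures, the derived columns can be recomputed from any single row. The sketch below assumes the conventions visible in analyze_mb2.py (98304 bytes of KV per token, decimal GB/s); the helper name `derive` is illustrative and not part of this commit.

```python
# Illustrative helper (not part of the commit): recompute the derived
# columns for one row of the breakdown JSON.
BYTES_PER_TOKEN = 98304  # divisor analyze_mb2.py uses for input_tokens_est


def derive(total_bytes: int, pure_transfer_s: float) -> dict:
    return {
        "input_tokens_est": total_bytes // BYTES_PER_TOKEN,
        "kv_mib": round(total_bytes / (1024 * 1024), 1),
        # Decimal gigabytes per second, as in the summary's throughput_gbps_*.
        "throughput_gbps": round(total_bytes / pure_transfer_s / 1e9, 2),
    }


# Fastest 131072-token row from the table above.
row = derive(12884901888, 1.3330365709844045)
print(row)
```

For that row the recomputed throughput agrees with the `throughput_gbps_max` entry of the 131072-token summary.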

Binary file not shown. Size: 72 KiB

Binary file not shown. Size: 92 KiB


@@ -0,0 +1,178 @@
#!/usr/bin/env python3
"""Decompose MB2 transfer events into the per-stage breakdown.

Inputs:
  --a-log  P-side jsonl with `send_blocks` events
           {event=send_blocks, total_bytes, duration_s, t_start_unix, ...}
  --b-log  D-side jsonl with `receive_kv_enter` and `receive_kv_finish` events
           {event=receive_kv_*, t_start_unix, duration_s (on finish), req_ids}

Pairing: each B receive_kv_enter is followed (in time order) by exactly one
receive_kv_finish for the same req_ids set. The send_blocks event on A whose
t_start_unix falls between enter.t_start_unix and
enter.t_start_unix + finish.duration_s (inclusive) is the pair-matched transfer.

Output:
  per-(input_tokens) summary printed to stdout
  --out  JSON with full table + per-size aggregates

Per-stage breakdown (paper-grade vocabulary):
  pure_transfer = send_blocks.duration_s
      Network data movement: batch_transfer_sync_write wall-time on P.
  rx_total = receive_kv_finish.duration_s
      Total time on D from receive_kv() entry to receiving FINISH from P.
      Includes ZMQ round-trip + P-side processing + pure_transfer.
  rx_overhead = rx_total - pure_transfer
      ZMQ handshake + P-side scheduling/setup time.

We do NOT report queueing or B-side post-transfer decode here; those
require correlation with client-side t_step2 timestamps. This script
operates on log files alone.
"""
from __future__ import annotations

import argparse
import json
import statistics
from pathlib import Path


def load_events(path: Path) -> list[dict]:
    rows = []
    with path.open() as f:
        for line in f:
            try:
                rows.append(json.loads(line))
            except json.JSONDecodeError:
                continue
    return rows


def pair_b_events(b_events: list[dict]) -> list[dict]:
    """Pair receive_kv_enter with the matching receive_kv_finish (by req_ids)."""
    open_by_key: dict[tuple, dict] = {}
    paired = []
    for e in b_events:
        key = tuple(sorted(e.get("req_ids", [])))
        if e["event"] == "receive_kv_enter":
            open_by_key[key] = e
        elif e["event"] == "receive_kv_finish":
            enter = open_by_key.pop(key, None)
            if enter is None:
                continue
            paired.append({
                "req_ids": list(key),
                "rx_t_start_unix": enter["t_start_unix"],
                "rx_duration_s": e["duration_s"],
                "rx_t_end_unix": enter["t_start_unix"] + e["duration_s"],
                "tp_rank": e.get("tp_rank"),
            })
    return paired
def match_a_to_b(a_events: list[dict], b_pairs: list[dict]) -> list[dict]:
    """For each B pair, find the first unmatched A send_blocks event whose
    t_start_unix lies within [rx_t_start, rx_t_end] (inclusive). Assumes
    b_pairs is in time order. Returns merged rows."""
    a_by_t = sorted(
        (e for e in a_events if e["event"] == "send_blocks"),
        key=lambda e: e["t_start_unix"],
    )
    merged = []
    j = 0
    for p in b_pairs:
        lo = p["rx_t_start_unix"]
        hi = p["rx_t_end_unix"]
        found = None
        # advance j to the first A event in window
        while j < len(a_by_t) and a_by_t[j]["t_start_unix"] < lo:
            j += 1
        if j < len(a_by_t):
            a = a_by_t[j]
            if a["t_start_unix"] <= hi:
                found = a
                j += 1
        if found is None:
            continue
        kv_bytes = found["total_bytes"]
        merged.append({
            # 98304 bytes of KV per token in this setup (48 MiB per 512 tokens)
            "input_tokens_est": kv_bytes // 98304,
            "total_bytes": kv_bytes,
            "pure_transfer_s": found["duration_s"],
            "rx_total_s": p["rx_duration_s"],
            "rx_overhead_s": max(0.0, p["rx_duration_s"] - found["duration_s"]),
            "rx_t_start_unix": p["rx_t_start_unix"],
            "send_t_start_unix": found["t_start_unix"],
            "req_ids": p["req_ids"],
        })
    return merged
def aggregate(rows: list[dict]) -> list[dict]:
    by_size: dict[int, list[dict]] = {}
    for r in rows:
        by_size.setdefault(r["input_tokens_est"], []).append(r)
    summary = []
    for size in sorted(by_size):
        rs = by_size[size]
        pts = [r["pure_transfer_s"] for r in rs]
        rxs = [r["rx_total_s"] for r in rs]
        ovs = [r["rx_overhead_s"] for r in rs]
        size_bytes = rs[0]["total_bytes"]
        size_mib = size_bytes / (1024 * 1024)
        bw = [size_bytes / p / 1e9 for p in pts]  # GB/s
        summary.append({
            "input_tokens": size,
            "kv_mib": round(size_mib, 1),
            "n": len(rs),
            "pure_transfer_ms_mean": round(statistics.mean(pts) * 1000, 2),
            "pure_transfer_ms_p50": round(statistics.median(pts) * 1000, 2),
            "pure_transfer_ms_max": round(max(pts) * 1000, 2),
            "pure_transfer_ms_min": round(min(pts) * 1000, 2),
            "rx_total_ms_mean": round(statistics.mean(rxs) * 1000, 2),
            "rx_overhead_ms_mean": round(statistics.mean(ovs) * 1000, 2),
            "throughput_gbps_mean": round(statistics.mean(bw), 2),
            "throughput_gbps_p50": round(statistics.median(bw), 2),
            "throughput_gbps_max": round(max(bw), 2),
        })
    return summary
def main() -> None:
    p = argparse.ArgumentParser()
    p.add_argument("--a-log", type=Path, required=True)
    p.add_argument("--b-log", type=Path, required=True)
    p.add_argument("--out", type=Path, default=None)
    args = p.parse_args()

    a_events = load_events(args.a_log)
    b_events = load_events(args.b_log)
    b_pairs = pair_b_events(b_events)
    merged = match_a_to_b(a_events, b_pairs)
    summary = aggregate(merged)

    print(f"loaded {len(a_events)} A events, {len(b_events)} B events; "
          f"paired {len(b_pairs)} B; matched {len(merged)} (A∩B)")
    print()
    print(f"{'in_tok':>8} {'KV_MiB':>8} {'n':>4} "
          f"{'pure_ms':>10} {'rx_ms':>10} {'overhead_ms':>12} "
          f"{'GB/s_p50':>10} {'GB/s_max':>10}")
    for s in summary:
        print(f"{s['input_tokens']:>8} {s['kv_mib']:>8.1f} {s['n']:>4} "
              f"{s['pure_transfer_ms_p50']:>10.1f} "
              f"{s['rx_total_ms_mean']:>10.1f} "
              f"{s['rx_overhead_ms_mean']:>12.1f} "
              f"{s['throughput_gbps_p50']:>10.2f} "
              f"{s['throughput_gbps_max']:>10.2f}")

    if args.out:
        args.out.parent.mkdir(parents=True, exist_ok=True)
        args.out.write_text(json.dumps({
            "rows": merged,
            "summary": summary,
        }, indent=2))
        print(f"\nwrote {args.out}")


if __name__ == "__main__":
    main()
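
The time-window match used by analyze_mb2.py can be illustrated on synthetic data. This is a minimal sketch rather than the script's own code: the function name `match_by_window` and all timestamps are made up, and both inputs are assumed to be sorted by time.

```python
# Synthetic illustration of a two-pointer time-window match.
# Timestamps are invented; they do not come from the MB2 logs.
def match_by_window(sends, windows):
    """Give each (lo, hi) receive window the first unused send whose start
    time lies in [lo, hi]. Both inputs must be sorted by time."""
    out, j = [], 0
    for lo, hi in windows:
        while j < len(sends) and sends[j] < lo:
            j += 1  # skip sends that started before this window opened
        if j < len(sends) and sends[j] <= hi:
            out.append((lo, hi, sends[j]))
            j += 1  # consume the send so it cannot match a later window
    return out


sends = [0.5, 10.2, 20.1, 35.0]                        # A-side send start times
windows = [(10.0, 10.5), (20.0, 20.4), (30.0, 30.3)]   # B-side enter..finish
matches = match_by_window(sends, windows)
print(matches)
```

A send that starts before the first window is skipped, and a window with no send inside it is dropped from the output, mirroring how unmatched B pairs are omitted from the merged rows.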


@@ -0,0 +1,100 @@
#!/usr/bin/env python3
"""Plot MB2 transfer-time + bandwidth curves."""
from __future__ import annotations

import argparse
import json
from pathlib import Path

import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt
import numpy as np


def main() -> None:
    p = argparse.ArgumentParser()
    p.add_argument("--breakdown", type=Path, required=True,
                   help="JSON from analyze_mb2.py")
    p.add_argument("--out-time", type=Path, default=Path("figs/mb2_transfer_time.png"))
    p.add_argument("--out-bw", type=Path, default=Path("figs/mb2_transfer_bw.png"))
    p.add_argument("--label", default="intra-node (kv_both, dash1 GPU 0+1)")
    args = p.parse_args()

    d = json.loads(args.breakdown.read_text())
    # Drop the spurious 16-token events (tiny 1.5 MiB sends produced by the
    # connector during request init; not a real KV transfer).
    rows = [r for r in d["rows"] if r["input_tokens_est"] >= 64]
    summary = [s for s in d["summary"] if s["input_tokens"] >= 64]

    kv_mib = [s["kv_mib"] for s in summary]
    p50_ms = [s["pure_transfer_ms_p50"] for s in summary]
    min_ms = [s["pure_transfer_ms_min"] for s in summary]
    max_ms = [s["pure_transfer_ms_max"] for s in summary]
    bw_p50 = [s["throughput_gbps_p50"] for s in summary]
    bw_max = [s["throughput_gbps_max"] for s in summary]

    # ---- pure transfer time vs KV size (log-log) ----
    fig, ax = plt.subplots(figsize=(8, 5))
    ax.errorbar(kv_mib, p50_ms,
                yerr=[np.array(p50_ms) - np.array(min_ms),
                      np.array(max_ms) - np.array(p50_ms)],
                fmt="o-", color="#1f77b4", lw=2, markersize=7,
                capsize=4, label="pure_transfer (batch_transfer_sync_write)")

    # 9.7 GB/s reference line
    ref_bw_gbps = 9.7
    ref_x = np.array(kv_mib)
    ref_y_ms = (ref_x * 1024 * 1024) / (ref_bw_gbps * 1e9) * 1000
    ax.plot(ref_x, ref_y_ms, "--", color="#888", alpha=0.7,
            label=f"ideal {ref_bw_gbps:.1f} GB/s reference")

    # agentic-relevant horizontal markers
    for name, ms in [("typical chatbot decode (~5 s)", 5000),
                     ("typical agentic decode (~50–200 ms)", 100)]:
        ax.axhline(ms, color="#c44e52", lw=0.8, ls=":", alpha=0.5)
        ax.text(kv_mib[-1] * 0.85, ms * 1.15, name, fontsize=8,
                color="#7a1d1d", ha="right")

    # p99 agentic KV vertical marker (the x axis is in MiB)
    p99_kv_mib = 11.5 * 1024  # 11.5 GiB expressed in MiB
    ax.axvline(p99_kv_mib, color="#c44e52", lw=0.8, ls=":", alpha=0.5)
    ax.text(p99_kv_mib, 0.7, "p99 agentic\nrequest 11.5 GiB",
            fontsize=8, color="#7a1d1d", ha="center")

    ax.set_xscale("log")
    ax.set_yscale("log")
    ax.set_xlabel("KV transfer size (MiB)")
    ax.set_ylabel("Pure transfer time (ms, log)")
    ax.set_title(f"MB2: KV transfer time vs size — {args.label}")
    ax.grid(True, which="both", alpha=0.3)
    ax.legend(loc="upper left", fontsize=9)
    args.out_time.parent.mkdir(parents=True, exist_ok=True)
    fig.tight_layout()
    fig.savefig(args.out_time, dpi=150)
    plt.close(fig)
    print(f"wrote {args.out_time}")

    # ---- bandwidth vs KV size ----
    fig, ax = plt.subplots(figsize=(8, 5))
    ax.plot(kv_mib, bw_p50, "o-", color="#2ca02c", lw=2, markersize=7,
            label="bandwidth p50")
    ax.plot(kv_mib, bw_max, "x--", color="#ff7f0e", lw=1.5, markersize=8,
            label="bandwidth max")
    ax.axhline(9.7, color="#888", ls="--", alpha=0.6,
               label="steady-state ≈ 9.7 GB/s")
    ax.set_xscale("log")
    ax.set_xlabel("KV transfer size (MiB)")
    ax.set_ylabel("Effective bandwidth (GB/s)")
    ax.set_ylim(0, 12)
    ax.set_title(f"MB2: KV transfer bandwidth vs size — {args.label}")
    ax.grid(True, which="both", alpha=0.3)
    ax.legend(loc="lower left", fontsize=9)
    args.out_bw.parent.mkdir(parents=True, exist_ok=True)
    fig.tight_layout()
    fig.savefig(args.out_bw, dpi=150)
    plt.close(fig)
    print(f"wrote {args.out_bw}")


if __name__ == "__main__":
    main()