Commit Graph

2 Commits

Author SHA1 Message Date
Claude Code Agent
7216507773 feat(snapshot): D→P RDMA Phase 1b — GPU pointer path verified
Confirms snapshot_link works for CUDA device pointers, not just host
memory. A sender on cuda:0 pushes to a receiver on cuda:1 via RDMA over
mlx5_60. All 5 sizes (16 KB, 1 MB, 16 MB, 64 MB, 256 MB) pass SHA verification.

  16 KB     8.3 ms   0.016 Gbps  (cold openSegment)
  1 MB      0.10 ms  87.6 Gbps
  16 MB     0.84 ms  159 Gbps
  64 MB     2.52 ms  213 Gbps
  256 MB    8.54 ms  251 Gbps    (~60% of NDR 400G line rate)
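
As a rough cross-check of the table above, throughput can be recomputed from payload size and elapsed time. This is only a sketch: it assumes binary sizes (1 MB = 2**20 bytes) and Gbps = 1e9 bits/s, and the helper name `gbps` is hypothetical, not part of snapshot_link. Because the timings are rounded to two digits, the recomputed values land near the reported ones rather than matching them exactly.

```python
def gbps(size_bytes: int, elapsed_ms: float) -> float:
    """Throughput in Gbps (1e9 bits/s) for a single transfer."""
    return size_bytes * 8 / (elapsed_ms * 1e-3) / 1e9

KIB, MIB = 2**10, 2**20  # assumption: the commit uses binary KB/MB

# (label, payload bytes, elapsed ms) copied from the Phase 1b table
rows = [
    ("16 KB", 16 * KIB, 8.3),
    ("1 MB", 1 * MIB, 0.10),
    ("16 MB", 16 * MIB, 0.84),
    ("64 MB", 64 * MIB, 2.52),
    ("256 MB", 256 * MIB, 8.54),
]

for label, size, ms in rows:
    print(f"{label:>6}  {ms:5.2f} ms  {gbps(size, ms):7.3f} Gbps")
```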

For Inferact-scale sessions (~50K tokens × ~80 KB layer-per-token =
~4 GB), this projects a D→P transfer time of ~130 ms at the 256 MB
rate (251 Gbps ≈ 31 GB/s), within the "reseed-savings" envelope
sketched in design doc §3.2.
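
The projection can be reproduced with a few lines of arithmetic. This sketch assumes decimal units for the session size (1 KB = 1000 bytes) and that the full payload moves at the 251 Gbps measured for the 256 MB transfer; all variable names are illustrative.

```python
tokens = 50_000          # ~50K tokens per session (from the commit message)
kb_per_token = 80        # ~80 KB per token (from the commit message)
rate_gbps = 251          # measured rate for the 256 MB transfer

# Assumption: decimal KB; using 1024-byte KB shifts the result by about 3 ms.
total_bytes = tokens * kb_per_token * 1000
total_gb = total_bytes / 1e9

# bits / (bits per second) -> seconds -> milliseconds
transfer_ms = total_bytes * 8 / (rate_gbps * 1e9) * 1e3

print(f"session size: {total_gb:.1f} GB")
print(f"projected D->P transfer: {transfer_ms:.0f} ms")
```

This counts wire time only; any per-transfer setup such as the cold openSegment cost is not included.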

Files:
  scripts/snapshot_link_receiver_gpu.py
  scripts/smoke_snapshot_link_gpu.py

Next: SGLang scheduler integration for D-side dump + P-side ingest.
2026-05-13 00:59:43 +08:00
Claude Code Agent
dc4867c270 feat(snapshot): D→P RDMA link Phase 1 — minimal byte transport
A thin wrapper around mooncake.engine.TransferEngine that does
one-sided RDMA writes between two SnapshotPeer endpoints. Bypasses
SGLang's MooncakeKVManager (which is hard-gated to PREFILL/DECODE
roles via add_transfer_request assertion at conn.py:1563) so the
D→P direction doesn't require invasive role-axis changes upstream.

Smoke test (two subprocess.Popen processes, mlx5_60, 127.0.0.1):
  1 KB    9.0 ms   (one-time openSegment handshake)
  16 KB   0.04 ms  3.5 Gbps
  1 MB    0.10 ms  82 Gbps
  16 MB   0.58 ms  232 Gbps
  64 MB   1.70 ms  316 Gbps   (~80% of NDR 400G line rate)

All 5 sizes pass SHA256 verification end-to-end.
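
A minimal sketch of what such an end-to-end check could look like: hash the payload on the sender side, hash what arrived on the receiver side, and compare digests. The `fake_transfer` function is a hypothetical in-process stand-in for the RDMA write, and the other names are illustrative; none of this is the code in scripts/smoke_snapshot_link.py.

```python
import hashlib
import os

def sha256_hex(buf) -> str:
    """Hex SHA256 digest of a bytes-like object."""
    return hashlib.sha256(buf).hexdigest()

def fake_transfer(payload: bytes) -> bytearray:
    # Hypothetical stand-in for the one-sided RDMA write: just copies bytes.
    return bytearray(payload)

def verify_transfer(size: int) -> bool:
    """Send `size` random bytes and compare sender/receiver digests."""
    payload = os.urandom(size)
    received = fake_transfer(payload)
    return sha256_hex(payload) == sha256_hex(received)

# A subset of the smoke-test sizes, kept small so the sketch runs quickly.
sizes = {"1 KB": 2**10, "16 KB": 16 * 2**10, "1 MB": 2**20}
results = {label: verify_transfer(n) for label, n in sizes.items()}
print(results)
```

Using random payloads rather than zeros or a repeating pattern makes it far less likely that a misplaced or truncated write still produces a matching digest.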

Files:
  src/agentic_pd_hybrid/snapshot_link.py — SnapshotPeer, SnapshotEndpoint
  scripts/snapshot_link_receiver.py      — child-process receiver
  scripts/smoke_snapshot_link.py         — sender + verifier
  docs/D_TO_P_PHASE1_LINK_ZH.md          — phase 1 acceptance doc

Next: Phase 2 (D-side scheduler commit hook), Phase 3 (P-side prefill
bypass with snapshot KV). See docs/D_TO_P_SYNC_DESIGN_ZH.md §5.
2026-05-13 00:55:55 +08:00