Gahow Wang 50f72d8875 MB2 inter-node scaffolding: per-host single-instance launcher + client host args
Adds the pieces needed to run the producer on dash1 and the consumer on
dash2 with the same shared cpfs venv:

start_vllm_single.sh
  INSTANCE / GPU / PORT / BP / MASTER / ROLE env vars; brings up ONE
  vLLM instance + applies the mooncake instrumentation patch (idempotent
  since the venv is cpfs-shared, so the first invocation applies and the
  second is a no-op). Per-instance MB2_LOG_DIR keeps producer/consumer
  events separate even though both directories live on the same cpfs
  path visible to both hosts.

mb2_kv_transfer.py
  New --src-host / --dst-host args. Defaults stay 127.0.0.1 for
  backward-compat with the intra-node sweep. /v1/completions URLs and
  /query URLs now use the supplied hosts. remote_bootstrap_addr is
  built as http://<src_host>:<src_bp> so the consumer's
  do_remote_prefill request carries a routable address.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 20:26:54 +08:00
Description
No description provided
48 MiB
Languages
Python 82.9%
Shell 17.1%