Adds the pieces needed to run the producer on dash1 and the consumer on
dash2 with the same shared cpfs venv:
start_vllm_single.sh
INSTANCE / GPU / PORT / BP / MASTER / ROLE env vars; brings up ONE
vLLM instance + applies the mooncake instrumentation patch (idempotent
since the venv is cpfs-shared, so the first invocation applies and the
second is a no-op). Per-instance MB2_LOG_DIR keeps producer/consumer
events separate even though both directories live on the same cpfs
path visible to both hosts.
mb2_kv_transfer.py
New --src-host / --dst-host args. Defaults stay 127.0.0.1 for
backward-compat with the intra-node sweep. /v1/completions URLs and
/query URLs now use the supplied hosts. remote_bootstrap_addr is
built as http://<src_host>:<src_bp> so the consumer's
do_remote_prefill request carries a routable address.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>