retry on ConnectError to handle kv_both connection instability
With RDMA_overhead=0.1s, offload triggers once C_s has just 700 tokens
pending (a 0.1s queue), versus 38k tokens (a 5.4s queue) with the old
2.0s estimate.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>