Same outputs/inferact_50sess.jsonl subset as E1/E2 (md5
7bb263a32600ef5a6ef5099ba340a487). Identical to E2 except adds
--kvcache-load-floor-bonus 200. Tests three hypotheses:
H1 (load balance): D2 receives non-trivial bindings (E1/E2: 0)
H2 (failure rate): mooncake batch_transfer timeouts disappear
because D0/D1 KV pool no longer saturates
(E2 had 1054 fails; expect ≤ E1's 85)
H3 (TTFT): E2's 0.43s p50 (over the 231 successes)
generalizes to most reqs once cascade is gone
K override via LOAD_FLOOR_BONUS env var (default 200).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>