Gahow Wang 63387f614d Full v3 trace re-profile with layer-wise: matched migrations improve
1213/1214 success; matched migrations (4 common) improved by 2.6 to 7.2 s,
scaling with the amount of prefill hidden behind the transfer. Trace-level
TTFT p90 -6% / p99 -5% (modest: migrations are 2% of requests and partly
queue-bound). Confirms that layer-wise removes the transfer half of
migration overhead but not the control-plane/queue residual. DESIGN.md
updated with results.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-28 19:16:37 +08:00
48 MiB
Languages
Python 82.9%
Shell 17.1%