MB5 PD ablation v2 results: concurrency axis + reuse 3-way writeup
- fig3_conc32k.json + fig3_concurrency_axis.png: agentic-corner concurrency sweep (in=32768, reuse=0.984, out=128), N 8->128, PD capped at 600s, colo uncapped. Colo completes 100% at every N (degrades gracefully, E2E 2.4s->81s); every static PD split collapses, and does so earlier as N rises (viable only at N<=16; <27% completion by N=32).
- analysis/mb5_pd_ablation/README.md: self-contained v2 writeup. The 3-way reuse axis (A=d2048/o256, C=d2048/o128, B=d1024/o128) decomposes shape: output is ~negligible, delta (real prefill per turn) is dominant; the crossover to colo at reuse ~90-95% is robust.

Run on dash2 (dash0 NICs faulted for Mooncake).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Binary file not shown.
Size: 158 KiB before, 171 KiB after.