Files
xtrain/docs
Gahow Wang 8bd7db16e1 docs: T16 grad-accum results — evolution row + README build-journey
dash5-verified gate numbers: accum=N bit-close to N× big batch (loss
8.5e-8 / grad 3.8e-5), accum=1 bit-identical (0.0), DDP+accum matches
single-GPU (5.7e-7), memory flat (same effective batch 64: 27.7GB big →
7.2GB accum, −74%), xserv closed loop md5-identical + token-identical.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 23:52:32 +08:00
..