xtrain

Files

Gahow Wang abe5ceb913 test: grad-accum equivalence + accum=1 bit-identity + DDP+accum

- grad_accum.rs: accum=N×B grads bit-close to a single N·B big batch;
  accum_steps=1 bit-identical (max|Δ|==0) to no-accum; real train() loop
  with accum tracks a big-batch baseline over 20 AdamW steps.
- ddp_correctness.rs: world=2 + accum=2 matches a single-GPU big batch of
  the same effective size (loss + cross-rank + vs-baseline).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-17 23:45:40 +08:00
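The equivalence the commit message describes can be illustrated with a small standalone sketch. This is not xtrain's code: the model (a linear map with a mean-squared-error loss, f64 on CPU) and every name below (mse_grad, accumulated_grad, max_abs_diff, demo_data) are hypothetical and chosen only for illustration. It assumes a mean-reduced loss and equal-sized micro-batches, so each micro-batch gradient is scaled by 1/accum_steps before being summed.

```rust
/// Gradient of the mean-squared-error loss for a toy linear model
/// `pred = w · x`, averaged over all rows in `xs`.
fn mse_grad(w: &[f64], xs: &[Vec<f64>], ys: &[f64]) -> Vec<f64> {
    let mut grad = vec![0.0; w.len()];
    for (x, &y) in xs.iter().zip(ys) {
        let pred: f64 = w.iter().zip(x).map(|(wi, xi)| wi * xi).sum();
        let err = pred - y;
        for (g, xi) in grad.iter_mut().zip(x) {
            *g += 2.0 * err * xi;
        }
    }
    let n = xs.len() as f64;
    grad.iter_mut().for_each(|g| *g /= n);
    grad
}

/// Split the batch into `accum_steps` equal micro-batches, compute each
/// micro-batch gradient, and sum them scaled by 1/accum_steps.
fn accumulated_grad(w: &[f64], xs: &[Vec<f64>], ys: &[f64], accum_steps: usize) -> Vec<f64> {
    assert!(!xs.is_empty() && accum_steps > 0 && xs.len() % accum_steps == 0);
    let micro = xs.len() / accum_steps;
    let mut total = vec![0.0; w.len()];
    for (cx, cy) in xs.chunks(micro).zip(ys.chunks(micro)) {
        let g = mse_grad(w, cx, cy);
        for (t, gi) in total.iter_mut().zip(&g) {
            *t += gi / accum_steps as f64;
        }
    }
    total
}

/// Largest absolute element-wise difference between two gradients.
fn max_abs_diff(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).fold(0.0, f64::max)
}

/// Small deterministic dataset: 8 samples, 3 features (made-up values).
fn demo_data() -> (Vec<Vec<f64>>, Vec<f64>, Vec<f64>) {
    let xs: Vec<Vec<f64>> = (0..8)
        .map(|i| (0..3).map(|j| ((i * 3 + j) % 7) as f64 * 0.25 - 0.5).collect())
        .collect();
    let ys: Vec<f64> = (0..8).map(|i| i as f64 * 0.1).collect();
    let w = vec![0.3, -0.2, 0.5];
    (xs, ys, w)
}
```

With accum_steps = 1 the accumulation path performs the same arithmetic as the single big batch, so the difference is exactly zero. With more steps the summation order changes, so the two gradients agree only up to floating-point rounding, which is why the first check can demand equality while the second needs a tolerance.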

src

train+ddp: micro-batch gradient accumulation (--accum-steps)

2026-06-17 23:45:33 +08:00

tests

test: grad-accum equivalence + accum=1 bit-identity + DDP+accum

2026-06-17 23:45:40 +08:00

build.rs

dist: nccl ffi + comm bootstrap

2026-06-15 17:14:56 +08:00

Cargo.toml

dist: ddp all-reduce + sharded batch

2026-06-15 17:15:29 +08:00