xtrain/crates/xtrain-tensor
Gahow Wang 1fdd0c5002 dropout: device RNG kernel + Tensor fwd/bwd (T18)
csrc/ops/dropout.cu: counter-based RNG (splitmix64 over seed^index) → fp32
uniform → Bernoulli(keep=1-p); fwd writes out=x⊙mask + an fp32 mask buffer
(per-element 1/(1-p) or 0); bwd applies the same mask (dx=d⊙mask). fp32 + bf16
activation variants (mask fp32 in both; uniform is dtype-independent so masks
match across precisions). Stateless → re-run with same seed = same mask (T13
recompute-safe). Registered in build.rs + FFI decls.
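
A minimal CPU reference sketch of the scheme described above, written in Rust for illustration only. It is not the contents of csrc/ops/dropout.cu. Assumptions: the helper names (`splitmix64`, `uniform01`, `dropout_forward`, `dropout_backward`) are hypothetical, the uniform is built from the top 24 bits of the hash, and an element is kept when u >= p; the actual kernel may differ on each of these points.

```rust
/// Standard splitmix64 step, used here as a stateless hash of a 64-bit counter.
fn splitmix64(mut z: u64) -> u64 {
    z = z.wrapping_add(0x9E37_79B9_7F4A_7C15);
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// fp32 uniform in [0, 1) for one element, derived from seed ^ index.
/// Assumption: the top 24 bits are used so the value is exact in fp32.
fn uniform01(seed: u64, index: u64) -> f32 {
    let h = splitmix64(seed ^ index);
    (h >> 40) as f32 * (1.0 / 16_777_216.0)
}

/// Forward pass: returns (out, mask) with out = x * mask.
/// Each mask entry is 1/(1-p) when the element is kept and 0 otherwise.
/// Requires 0 <= p < 1.
fn dropout_forward(x: &[f32], p: f32, seed: u64) -> (Vec<f32>, Vec<f32>) {
    let scale = 1.0 / (1.0 - p);
    let mask: Vec<f32> = (0..x.len())
        .map(|i| {
            // Assumption: keep when u >= p, which happens with probability 1-p.
            if uniform01(seed, i as u64) >= p { scale } else { 0.0 }
        })
        .collect();
    let out: Vec<f32> = x.iter().zip(&mask).map(|(x, m)| x * m).collect();
    (out, mask)
}

/// Backward pass: dx = d * mask, reusing the mask saved by the forward pass.
fn dropout_backward(d: &[f32], mask: &[f32]) -> Vec<f32> {
    d.iter().zip(mask).map(|(d, m)| d * m).collect()
}
```

Because the mask depends only on (seed, index, p), calling `dropout_forward` twice with the same seed reproduces the same mask, which is the property that makes recomputation safe.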

Tensor::dropout(p,seed)->(out,mask) and Tensor::dropout_backward(d,mask) wrap the
launches (contiguous F32/BF16, default stream, per-op sync via the kernels).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-18 00:05:18 +08:00