gahow/xtrain
9e958cb0f974b16b8adaee1d8273562b7abd4961
xtrain/crates
Gahow Wang
9b05f4f93f
test: flash==composed bf16 uses robust mean/p99 metric (repo convention)
...
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 23:19:08 +08:00
| Directory | Last commit | Date |
| --- | --- | --- |
| xtrain-autodiff | test: eps=2e-3 for flash dQ/dK finite-diff (cuts f32 rounding term) | 2026-06-17 23:17:44 +08:00 |
| xtrain-cuda | cuda: fused flash-attention kernel (fwd + flash-style bwd) | 2026-06-17 23:10:25 +08:00 |
| xtrain-distributed | test+bins: flash grad-check, flash==composed, PyTorch parity, --flash flag | 2026-06-17 23:10:39 +08:00 |
| xtrain-model | test: flash==composed bf16 uses robust mean/p99 metric (repo convention) | 2026-06-17 23:19:08 +08:00 |
| xtrain-optim | perf: make xtrain-cuda a regular dep of xtrain-optim (GPU AdamW) | 2026-06-15 16:53:52 +08:00 |
| xtrain-tensor | cuda: fused flash-attention kernel (fwd + flash-style bwd) | 2026-06-17 23:10:25 +08:00 |
| xtrain-train | test+bins: flash grad-check, flash==composed, PyTorch parity, --flash flag | 2026-06-17 23:10:39 +08:00 |