Gahow Wang (gahow)
  • Joined on 2026-04-03
gahow pushed to feat/fig18-real-output-lca-substrate at gahow/aituner 2026-06-19 03:29:34 +00:00
76cca89a43 Add harness-only dash1 driver to verify the gpu-mem-util fix recovers ~0.87 + stops
gahow pushed to feat/fig18-real-output-lca-substrate at gahow/aituner 2026-06-19 03:27:49 +00:00
83162e7a64 Ablation: pin gpt-5.5 @ ai.gahow.org (chat.completions); re-read token per arm
a3523f5601 Harness: explore gpu-memory-utilization (and raise max-num-seqs) before Stop-B
gahow deleted branch t21-ddp-dropout from gahow/xtrain 2026-06-18 15:27:25 +00:00
gahow pushed to main at gahow/xtrain 2026-06-18 15:27:25 +00:00
a1370446fe docs: T21 — record DDP-dropout wiring gap + fix (known-issues / evolution / dropout doc)
980605474b test: T21 — DDP-dropout regression (live under DDP + p=0 bit-identical)
81f3cf59e5 distributed: T21 — wire dropout into the DDP path (--dropout + model.train())
gahow pushed to t21-ddp-dropout at gahow/xtrain 2026-06-18 13:23:03 +00:00
a1370446fe docs: T21 — record DDP-dropout wiring gap + fix (known-issues / evolution / dropout doc)
980605474b test: T21 — DDP-dropout regression (live under DDP + p=0 bit-identical)
gahow pushed to t21-ddp-dropout at gahow/xtrain 2026-06-18 13:17:54 +00:00
a02a0ee9ca fixup! test: T21 — DDP-dropout regression (live under DDP + p=0 bit-identical)
gahow pushed to t21-ddp-dropout at gahow/xtrain 2026-06-18 13:16:45 +00:00
41c25271e6 fixup! test: T21 — DDP-dropout regression (live under DDP + p=0 bit-identical)
gahow pushed to t21-ddp-dropout at gahow/xtrain 2026-06-18 13:15:13 +00:00
3c88fb7178 fixup! test: T21 — DDP-dropout regression (live under DDP + p=0 bit-identical)
gahow pushed to t21-ddp-dropout at gahow/xtrain 2026-06-18 13:14:04 +00:00
f7e893282a docs: T21 — record DDP-dropout wiring gap + fix (known-issues / evolution / dropout doc)
a447631c4b test: T21 — DDP-dropout regression (live under DDP + p=0 bit-identical)
gahow pushed to t21-ddp-dropout at gahow/xtrain 2026-06-18 13:12:17 +00:00
24c3da2f42 docs: T21 — record DDP-dropout wiring gap + fix (known-issues / evolution / dropout doc)
c35d3851d2 test: T21 — DDP-dropout regression (live under DDP + p=0 bit-identical)
gahow pushed to t21-ddp-dropout at gahow/xtrain 2026-06-18 13:08:42 +00:00
1eef10afd9 docs: T21 — record DDP-dropout wiring gap + fix (known-issues / evolution / dropout doc)
1b58bd8626 test: T21 — DDP-dropout regression (live under DDP + p=0 bit-identical)
81f3cf59e5 distributed: T21 — wire dropout into the DDP path (--dropout + model.train())
gahow created branch t21-ddp-dropout in gahow/xtrain 2026-06-18 13:08:42 +00:00
gahow deleted branch t20-capstone from gahow/xtrain 2026-06-18 11:51:55 +00:00
gahow deleted branch t18-dropout from gahow/xtrain 2026-06-18 11:51:55 +00:00
gahow deleted branch t17-process-per-gpu from gahow/xtrain 2026-06-18 11:51:55 +00:00
gahow deleted branch t16-grad-accum from gahow/xtrain 2026-06-18 11:51:55 +00:00
gahow deleted branch t15-gqa from gahow/xtrain 2026-06-18 11:51:55 +00:00
gahow deleted branch t14-flash-attention from gahow/xtrain 2026-06-18 11:51:55 +00:00
gahow pushed to main at gahow/xtrain 2026-06-18 11:51:44 +00:00
db70abe450 docs: T20 — Phase-2 systems-depth capstone (reframe README to two phases)
gahow pushed to main at gahow/xserv 2026-06-18 10:12:07 +00:00
531cd3fe08 style: format Rust workspace
013465fc06 docs: Phase 21 — decode CUDA graph + GPU argmax results
8414f8d1e6 sampling: GPU argmax fast path for greedy decode
34224c7c93 gpt-oss: replay the whole batch=1 decode step as one CUDA graph
4088f49b7d cuda: infrastructure for whole-step CUDA graph capture
Compare 10 commits »