Gahow Wang
gahow
0 Followers · 0 Following
Joined on 2026-04-03
Repositories: 17
Public Activity
gahow pushed to t20-capstone at gahow/xtrain, 2026-06-18 10:11:54 +00:00
db70abe450 docs: T20 — Phase-2 systems-depth capstone (reframe README to two phases)
gahow created branch t20-capstone in gahow/xtrain, 2026-06-18 10:11:54 +00:00
gahow pushed to main at gahow/xtrain, 2026-06-18 10:05:08 +00:00
71b0a1621f docs: T17 process-per-GPU results — measured throughput-neutral
4abb17383a test: process-per-GPU DDP correctness (ddp_proc.rs)
a188c8a277 distributed: train_ddp_mp bin (process-per-GPU launcher/worker)
ffd548b80b distributed: process-per-GPU launcher + worker (proc.rs)
c470c627a7 docs: Phase T17 — process-per-GPU DDP design
Compare 5 commits »
gahow pushed to t17-process-per-gpu at gahow/xtrain, 2026-06-18 10:03:21 +00:00
71b0a1621f docs: T17 process-per-GPU results — measured throughput-neutral
gahow pushed to t17-process-per-gpu at gahow/xtrain, 2026-06-18 09:48:58 +00:00
4abb17383a test: process-per-GPU DDP correctness (ddp_proc.rs)
a188c8a277 distributed: train_ddp_mp bin (process-per-GPU launcher/worker)
ffd548b80b distributed: process-per-GPU launcher + worker (proc.rs)
c470c627a7 docs: Phase T17 — process-per-GPU DDP design
Compare 4 commits »
gahow created branch t17-process-per-gpu in gahow/xtrain, 2026-06-18 09:48:58 +00:00
gahow pushed to main at gahow/xtrain, 2026-06-18 09:39:26 +00:00
2ff4573a31 docs: T15 GQA results + evolution row (model architecture) + README build-journey row
39df0b40c1 gqa: fix kv-proj shape test param indices (embed,attn_norm precede wq)
830d06ad01 gqa: real grouped-query attention (repeat_kv op + both SDPA paths + wiring + tests)
62b1cb5dc7 docs: Phase T15 — GQA design (repeat_kv broadcast op + backward grad-sum)
Compare 4 commits »
gahow pushed to feat/fig18-real-output-lca-substrate at gahow/aituner, 2026-06-18 01:06:07 +00:00
95c02d7dd9 Fig-18: chained driver for 2 extra naive runs (n=3 nondeterminism)
gahow pushed to t15-gqa at gahow/xtrain, 2026-06-17 17:45:00 +00:00
2ff4573a31 docs: T15 GQA results + evolution row (model architecture) + README build-journey row
gahow pushed to t15-gqa at gahow/xtrain, 2026-06-17 17:38:43 +00:00
39df0b40c1 gqa: fix kv-proj shape test param indices (embed,attn_norm precede wq)
gahow pushed to t15-gqa at gahow/xtrain, 2026-06-17 17:37:44 +00:00
830d06ad01 gqa: real grouped-query attention (repeat_kv op + both SDPA paths + wiring + tests)
62b1cb5dc7 docs: Phase T15 — GQA design (repeat_kv broadcast op + backward grad-sum)
Compare 2 commits »
gahow created branch t15-gqa in gahow/xtrain, 2026-06-17 17:37:44 +00:00
gahow deleted branch integration-t14-t16-t18 from gahow/xtrain, 2026-06-17 17:24:04 +00:00
gahow pushed to main at gahow/xtrain, 2026-06-17 17:23:57 +00:00
4b6d3e0a79 test: flash+dropout cross-feature grad-check (Phase-2 integration)
c36cdf74d1 Merge t18-dropout into main
f26db882e5 Merge t16-grad-accum into main
9e958cb0f9 Merge t14-flash-attention into main
80fafa1914 docs: T18 evolution row + README build-journey row (dropout)
Compare 26 commits »
gahow pushed to integration-t14-t16-t18 at gahow/xtrain, 2026-06-17 16:43:55 +00:00
4b6d3e0a79 test: flash+dropout cross-feature grad-check (Phase-2 integration)
gahow pushed to integration-t14-t16-t18 at gahow/xtrain, 2026-06-17 16:41:56 +00:00
c36cdf74d1 Merge t18-dropout into main
f26db882e5 Merge t16-grad-accum into main
9e958cb0f9 Merge t14-flash-attention into main
Compare 3 commits »
gahow created branch integration-t14-t16-t18 in gahow/xtrain, 2026-06-17 16:41:56 +00:00
gahow pushed to t18-dropout at gahow/xtrain, 2026-06-17 16:06:12 +00:00
80fafa1914 docs: T18 evolution row + README build-journey row (dropout)
e625aa05dd dropout: wire into model (residual sites) + train/eval switch + flag (T18)
5eb27783f8 dropout: autodiff op + fixed-seed grad-check (T18)
1fdd0c5002 dropout: device RNG kernel + Tensor fwd/bwd (T18)
6b8c1e4e0f docs: Phase T18 — dropout design (device RNG + mask)
Compare 5 commits »
gahow created branch t18-dropout in gahow/xtrain, 2026-06-17 16:06:12 +00:00
gahow pushed to t16-grad-accum at gahow/xtrain, 2026-06-17 15:52:33 +00:00
8bd7db16e1 docs: T16 grad-accum results — evolution row + README build-journey