Gahow Wang
gahow
0 Followers · 0 Following
Joined on 2026-04-03
Repositories: 17
Public Activity
gahow pushed to main at gahow/aituner
2026-06-17 01:55:41 +00:00
8e58b4033d
Note dash1 lacks LLM gateway access (naive-completion deferred to dash0)
gahow pushed to main at gahow/aituner
2026-06-17 01:52:56 +00:00
b779f6e56a
Add dash1 naive-completion driver for the ablation
gahow pushed to main at gahow/aituner
2026-06-17 01:51:59 +00:00
e7d1b3ba01
Harness-vs-naive ablation result: harness steers to TP & converges; naive wanders
gahow pushed to main at gahow/xtrain
2026-06-17 01:50:38 +00:00
0150263055
perf: KI-3 fixed — dim1024 batch32 fits, mem 31.1→14.6GB, tok/s 39.7K→31.5K
gahow pushed to main at gahow/xtrain
2026-06-17 01:45:16 +00:00
69c5f07359
docs: Phase T13 — activation recompute
gahow pushed to main at gahow/xtrain
2026-06-17 01:44:03 +00:00
a12dcf18d0
docs: Phase T13 — activation recompute
f202351be5
model: per-block activation recompute (--recompute)
c396b39483
autodiff: checkpoint primitive (recompute-on-backward)
Compare 3 commits »
gahow pushed to main at gahow/xtrain
2026-06-16 19:55:53 +00:00
9c557f0609
docs: run v7 — FineWeb subset near-ceiling at dim768 (val 3.01)
gahow pushed to main at gahow/xtrain
2026-06-16 14:21:49 +00:00
b4bb426d48
docs: run v6 — FineWeb-edu graduation (val 3.07, new distribution)
gahow pushed to main at gahow/aituner
2026-06-16 12:59:48 +00:00
579dd86698
Ablation: --skip-baseline so loops climb from first proposal
gahow pushed to main at gahow/aituner
2026-06-16 12:30:42 +00:00
37342a5749
Add chained harness-vs-naive ablation driver (sequential runs + DONE marker)
gahow pushed to main at gahow/aituner
2026-06-16 12:29:31 +00:00
5965f4fbbc
Ablation substrate: scale=0.5 + out=128 + 6 probes (TP1 measurable, tractable)
gahow pushed to main at gahow/aituner
2026-06-16 12:16:32 +00:00
a1cbab0e69
Document harness-vs-naive ablation: setup, substrate calibration, blocker
gahow pushed to main at gahow/aituner
2026-06-16 12:01:20 +00:00
0794efa249
Reduce ablation probe budget to 3 per trial for tractability
gahow pushed to main at gahow/aituner
2026-06-16 11:50:07 +00:00
d975e57bb5
Scale ablation early-stop caps to the compressed window (scale=0.2)
gahow pushed to main at gahow/aituner
2026-06-16 11:31:27 +00:00
a16016a876
Add harness vs naive ablation configs (27b, scale=0.2 substrate)
gahow pushed to main at gahow/xtrain
2026-06-16 11:30:54 +00:00
88bec270af
docs: evolution overview — per-milestone changes across algorithm/arch/infra/dataset axes
gahow pushed to main at gahow/aituner
2026-06-16 11:16:43 +00:00
07f5d92e1d
Add consolidated two-stop summary doc
f2ff0faebd
Document Stop-B end-to-end on dense 27B: the improving climb + no-regression
4a64196a99
Add 27B Stop-B agentic-loop config (harness-driven, GPUs 2-7)
b17b213575
Tear down the engine on SIGTERM instead of orphaning it
93ce339d61
Document 27B TP sweep: per-GPU rises sharply with TP (dense), opposite of MoE
Compare 28 commits »
gahow pushed to main at gahow/xtrain
2026-06-16 11:04:47 +00:00
7e5ea9976b
data: FineWeb-edu parquet->txt prep script (Scaling v6)
gahow pushed to feat/two-stop at gahow/aituner
2026-06-16 10:07:01 +00:00
f2ff0faebd
Document Stop-B end-to-end on dense 27B: the improving climb + no-regression
gahow pushed to main at gahow/xtrain
2026-06-16 09:56:32 +00:00
579365f4a0
docs: run v5 — TinyStories saturation at dim768 (val 1.11)
8a1e29543b
run: v5 archive + export (dim768, bf16, 5.33ep, val 1.11)
Compare 2 commits »