xtrain/crates
Gahow Wang 15f1e526c7 train: parameterize model size (scaling ladder)
Add Config::from_arch(vocab, n_heads, head_dim, n_layers, ffn) so the model
size is a tunable rung on the scaling ladder instead of a hardcoded tiny config.
Add Config::core_params(), which is num_params minus the two vocab×dim tables
(embed and lm_head). This is the figure the ladder is sized against: at
vocab 50257 those tables add a fixed ~25M parameters that are not capacity.
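
A minimal sketch of how the two additions could fit together, not the crate's actual code. The struct layout, the dim() helper, the relation dim = n_heads * head_dim, and the weights-only parameter formula (no biases, no norm parameters) are all assumptions made for illustration; only the from_arch argument list and the "num_params minus the two vocab×dim tables" definition come from the message above.

```rust
/// Hypothetical stand-in for the crate's Config; field names are assumed.
#[derive(Debug, Clone, PartialEq)]
pub struct Config {
    pub vocab: usize,
    pub n_heads: usize,
    pub head_dim: usize,
    pub n_layers: usize,
    pub ffn: usize,
}

impl Config {
    /// Build a config for one rung of the ladder from its architecture knobs.
    pub fn from_arch(
        vocab: usize,
        n_heads: usize,
        head_dim: usize,
        n_layers: usize,
        ffn: usize,
    ) -> Self {
        Config { vocab, n_heads, head_dim, n_layers, ffn }
    }

    /// Assumption: model width is n_heads * head_dim.
    pub fn dim(&self) -> usize {
        self.n_heads * self.head_dim
    }

    /// Size of the two vocab×dim tables (embed and lm_head, untied).
    fn table_params(&self) -> usize {
        2 * self.vocab * self.dim()
    }

    /// Assumed count: weight matrices only (no biases, no norm parameters).
    pub fn num_params(&self) -> usize {
        let d = self.dim();
        let attn = 4 * d * d; // q, k, v and output projections
        let mlp = 2 * d * self.ffn; // up and down projections
        self.table_params() + self.n_layers * (attn + mlp)
    }

    /// num_params minus the two vocab×dim tables.
    pub fn core_params(&self) -> usize {
        self.num_params() - self.table_params()
    }
}
```

Under these assumptions, Config::from_arch(50257, 4, 64, 4, 1024) has width 256, so the two tables hold 2 × 50257 × 256 = 25,731,584 parameters, in line with the ~25M figure, while core_params() depends only on n_heads, head_dim, n_layers and ffn. The width actually used by the ladder is not stated here.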

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 18:34:39 +08:00