New xtrain-optim crate. AdamW with per-param m/v moments keyed by params() index, global bias correction, and decoupled weight decay (matches torch.optim.AdamW). Split into a pure-host step_host (flat f32 buffers, unit-testable on a GPU-less host) and a step(&[Var]) wrapper that round-trips each param value/grad through the GPU tensor (gated not(no_cuda)). Per-step lr argument leaves room for an LR schedule. Host unit test checks the update against an independent reference recurrence over 20 steps and the pure-decay (g=0) boundary. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
19 lines
303 B
TOML
19 lines
303 B
TOML
[workspace]
|
|
resolver = "2"
|
|
members = [
|
|
"crates/xtrain-cuda",
|
|
"crates/xtrain-tensor",
|
|
"crates/xtrain-autodiff",
|
|
"crates/xtrain-model",
|
|
"crates/xtrain-optim",
|
|
]
|
|
|
|
[workspace.package]
|
|
version = "0.1.0"
|
|
edition = "2024"
|
|
license = "MIT"
|
|
|
|
[workspace.dependencies]
|
|
half = "2"
|
|
smallvec = "1"
|