New bin export_safetensors: load an xtrain checkpoint and export it as an
HF Qwen3 model directory. The exporter:

- maps every param to its HF Qwen3 tensor name;
- transposes 2D projection weights [in,out] -> [out,in], leaving 1D norms
  and the [vocab,dim] embed/lm_head untouched;
- casts every tensor to BF16 (xserv's qwen3 forward is BF16-only);
- writes config.json, model.safetensors, and a copy of the gpt2
  tokenizer.json.

The model is sized exactly as in bin/train.rs. Uses safetensors 0.5 to
match xserv. The GPU body is gated behind not(no_cuda).
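
For reviewers, a minimal std-only sketch of the per-tensor layout step
(transpose, then BF16 cast). This is not the code in this commit: the helper
names transpose_2d, f32_to_bf16_bits, and to_hf_bf16_bytes are hypothetical,
and it assumes f32 row-major source data and round-to-nearest-even for the
cast. The caller decides which tensors to transpose, since embed/lm_head are
2D but must be kept as-is.

```rust
/// Transpose a row-major [rows, cols] matrix into row-major [cols, rows].
/// (Hypothetical helper, for illustration only.)
fn transpose_2d(src: &[f32], rows: usize, cols: usize) -> Vec<f32> {
    assert_eq!(src.len(), rows * cols, "shape does not match data length");
    let mut dst = vec![0.0f32; src.len()];
    for r in 0..rows {
        for c in 0..cols {
            dst[c * rows + r] = src[r * cols + c];
        }
    }
    dst
}

/// Convert an f32 to BF16 bits (the upper 16 bits of the f32 pattern),
/// rounding to nearest, ties to even. Assumed rounding mode.
fn f32_to_bf16_bits(x: f32) -> u16 {
    let bits = x.to_bits();
    if x.is_nan() {
        // Force a mantissa bit so truncation cannot turn NaN into infinity.
        return ((bits >> 16) as u16) | 0x0040;
    }
    let lsb = (bits >> 16) & 1; // lowest bit that survives the cast
    let rounded = bits.wrapping_add(0x7FFF + lsb);
    (rounded >> 16) as u16
}

/// Produce the output shape and little-endian BF16 bytes for one tensor.
/// `transpose` should be true only for 2D projection weights.
fn to_hf_bf16_bytes(data: &[f32], shape: &[usize], transpose: bool) -> (Vec<usize>, Vec<u8>) {
    let (out_shape, values) = if transpose && shape.len() == 2 {
        let (rows, cols) = (shape[0], shape[1]);
        (vec![cols, rows], transpose_2d(data, rows, cols))
    } else {
        (shape.to_vec(), data.to_vec())
    };
    let mut bytes = Vec::with_capacity(values.len() * 2);
    for v in values {
        bytes.extend_from_slice(&f32_to_bf16_bits(v).to_le_bytes());
    }
    (out_shape, bytes)
}
```

In this sketch a [in,out] = [2,3] projection comes back with shape [3,2] and
two bytes per element, while a 1D norm passes through with its shape
unchanged.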

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>