Files
xtrain/crates/xtrain-distributed
Gahow Wang 6465a2d5ce test: T21-for-proc — clear ENV_DROPOUT across tests to sever ordering coupling
libtest with --test-threads=1 (the documented invariant for this file's DDP
tests) runs tests alphabetically. The new
proc_per_gpu_dropout_is_live_and_p0_matches_no_dropout ('d') runs BEFORE
proc_per_gpu_matches_single_gpu_and_thread_path ('m'). It sets ENV_DROPOUT=0.2
via std::env::set_var; if left in place, the correctness test's spawned workers
would inherit it (Command inherits parent env by default) and build with
cfg.dropout=0.2 while its single-GPU baseline (run_single_gpu → test_config →
dropout=0) stays at 0 — GATE (a) `max_rel_single < 1e-3` would blow up by
orders of magnitude.

Two defenses:
- correctness test remove_var(ENV_DROPOUT) before spawn (belt): even if the
  dropout test forgot to clean up, this test starts from a clean env.
- dropout test remove_var(ENV_DROPOUT, ENV_DUMP_DIR) at exit (suspenders):
  keep the invariant "each test leaves the env as it found it" so any future
  test added after these two starts clean too.

Same --test-threads=1 SAFETY comment applies (no concurrent env access).
2026-07-01 14:09:42 +08:00
..
2026-06-15 17:14:56 +08:00