Commit Graph

2 Commits

Author SHA1 Message Date
83162e7a64 Ablation: pin gpt-5.5 @ ai.gahow.org (chat.completions); re-read token per arm
- Pin endpoint.model=gpt-5.5, base_url=https://ai.gahow.org/v1, wire_api=chat.completions
  in both ablation specs so both arms uniformly use the current ~/.codex model (the
  prior runs used the stale ai.prism.uno/gpt-5.4 that config.toml has since moved off).
- run_ablation_pair_d1.sh re-reads the codex token from auth.json right before each arm
  instead of capturing it once at launch (the stale-at-use capture 401'd naive 2/3).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 11:27:47 +08:00
0c23285f39 Fig18 substrate: real output_length + criterion-A time_scale + Stop-A drain deadline
Replace the out=128 / scale=0.5 ablation substrate with a paper-faithful one:
- Use the trace's real output_length (drop completion_tokens_override=128). The
  0-8k chat window has p50=531 / p99=2436 / max=35168 output tokens, so decode
  (TPOT) becomes the dominant bottleneck instead of an artificial 128-token cap.
- replay_time_scale=0.8775, chosen by criterion-A: binary-search the smallest
  scale whose A-family L-C-A similarity to the real (scale=1.0) arrivals stays
  >= tau (0.90). The old scale=0.5 had sim_A=0.56, distorting the arrival axis
  far below the tau bar used everywhere else. New calibrator:
  scripts/calibrate_time_scale.py.
- Per-probe Stop-A-consistent drain deadline (worker._probe_drain_deadline): the
  wall-clock a *feasible* config needs to drain the LCA-admitted set
  (last_arrival + worst-case TTFT + p99_out * TPOT budget + margin). With real
  outputs decode dominates wall-clock, so the old fixed 320s cap would truncate
  the Stop-A offered window mid-decode. early_stop_max_elapsed_s (1000s) is now a
  hard ceiling; the per-probe deadline governs. The lag cap still cuts overload.

12-iter paired driver (both arms on dash1, removes the dash0/dash1 host confound):
scripts/run_ablation_pair_d1.sh. 115 tests pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 17:24:00 +08:00