Record decode validation follow-up

2026-04-28 21:20:41 +08:00
parent 38ff4380e5
commit 46e9040613

@@ -82,6 +82,12 @@ Follow-up implementation after this result:
- The proposal rules now explicitly say not to stop solely because a strong incumbent appeared.
- Proposal parsing now accepts a structured `observation`/`diagnosis` and converts it to text, so a usable validation proposal is no longer dropped merely because the LLM returned an object instead of a string.
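
The structured-field handling described above could look roughly like the following. This is a minimal sketch, not the harness's actual code: the helper names `field_to_text` and `normalize_proposal` are hypothetical, and serializing objects as sorted JSON is an assumption about what "converting them to text" means.

```python
import json
from typing import Any


def field_to_text(value: Any) -> str:
    """Coerce a proposal field to plain text (hypothetical helper)."""
    if value is None:
        return ""
    if isinstance(value, str):
        return value.strip()
    # Assumption: objects and lists are rendered as stable, sorted JSON
    # so the same structure always yields the same text.
    return json.dumps(value, sort_keys=True, ensure_ascii=False)


def normalize_proposal(raw: dict) -> dict:
    """Return a copy of the proposal with text-only observation/diagnosis."""
    out = dict(raw)
    for key in ("observation", "diagnosis"):
        if key in out:
            out[key] = field_to_text(out[key])
    return out


proposal = normalize_proposal(
    {"observation": {"metric": "decode_tpot", "failed": True}, "diagnosis": " decode bound "}
)
print(proposal["observation"])
print(proposal["diagnosis"])
```

Coercing at parse time keeps downstream validation unchanged, since it still only sees strings.
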
After the implementation fix, the previously rejected `proposal-0004` was resumed as a validation trial:
- `trial-0004`: same topology validation with `max-num-seqs=160`.
- Remote tmux: `aituner_qwen235b_decode_harness_validate_20260428`.
- Status as of 2026-04-28 13:20 UTC on dash0: running; no result has been written yet.
## Follow-up Fix
The seeded prompt exposed a generic diagnosis issue: if the best feasible probe had no latency failures, the harness could overlook the earlier infeasible probe that showed the real bottleneck at higher load. The harness now scans the probe sequence backward and uses the nearest non-trivial bottleneck, falling back to the best feasible probe only when none is found. This keeps decode-only runs focused on `decode_tpot` after a feasible low-load point, without adding testcase thresholds.
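
The backward scan can be sketched as follows. This is an illustration under stated assumptions, not the harness's implementation: the `Probe` record, its fields, and `pick_diagnosis_probe` are hypothetical names, a "non-trivial bottleneck" is modeled as any probe with a recorded bottleneck label, and "best feasible" is assumed to mean the feasible probe with the highest load.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Probe:
    """One load probe (hypothetical record shape)."""
    load: float
    feasible: bool
    # Assumption: None means the probe recorded no latency failures.
    bottleneck: Optional[str] = None


def pick_diagnosis_probe(probes: Sequence[Probe]) -> Optional[Probe]:
    """Choose the probe that should drive the diagnosis."""
    # Walk from the most recent probe backward and stop at the first
    # one that recorded a bottleneck.
    for probe in reversed(probes):
        if probe.bottleneck is not None:
            return probe
    # Fallback: no probe showed a bottleneck, so use the best feasible
    # probe (assumed here to be the one with the highest load).
    feasible = [p for p in probes if p.feasible]
    return max(feasible, key=lambda p: p.load) if feasible else None


probes = [
    Probe(load=1.0, feasible=True),
    Probe(load=4.0, feasible=False, bottleneck="decode_tpot"),
    Probe(load=2.0, feasible=True),  # feasible low-load point, no failures
]
chosen = pick_diagnosis_probe(probes)
print(chosen.bottleneck if chosen else None)
```

In this example the final probe is feasible and clean, yet the scan still returns the earlier infeasible probe, so the diagnosis stays on `decode_tpot`.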