adb5356c4b
Add advisory harness attribution and descriptor planner MVP
2026-06-30 12:05:03 +08:00
08429e5da8
Refine harness design flow overview
2026-06-29 20:41:54 +08:00
00ba573631
Document harness design contract
2026-06-29 20:26:58 +08:00
6ea259a0a3
Keep target topology explicit in delta projections
2026-06-29 19:56:50 +08:00
6b4efdad82
Relax lower-frontier delta projection gate
2026-06-29 17:57:29 +08:00
9ef9550214
Use full state for frontier projection
2026-06-29 16:22:09 +08:00
8dd9ada194
Add frontier delta projection harness candidates
2026-06-29 16:15:06 +08:00
6c84dc91d7
Document hardened topology feedback
2026-06-29 02:34:12 +08:00
1c4ed4cab3
Document hardened harness feedback
2026-06-29 02:28:30 +08:00
6b25d56c1f
Gate GMU climb on measured improvement
2026-06-29 02:00:41 +08:00
ee101a7c24
Harden prefill scheduler harness
2026-06-29 01:54:02 +08:00
bfd85793f3
Prioritize uncovered prefill scheduler candidates
2026-06-29 01:30:34 +08:00
36c301c128
Add normalized prefill scheduler harness
2026-06-29 01:12:19 +08:00
7ad439730e
Add llm-first tuning proposal policy
2026-06-27 12:21:51 +08:00
9accf2575e
Require harness proposals from candidate sets
2026-06-27 01:03:30 +08:00
bef260f183
Document bad-start robustness suite
2026-06-26 22:19:46 +08:00
2937539b49
Persist harness candidate set snapshots
2026-06-26 22:17:47 +08:00
5080b50315
Veto repeated materialized configs
2026-06-26 22:15:47 +08:00
825d3e03e9
Add harness candidate set audit
2026-06-26 22:02:09 +08:00
42f75553a6
Document full config signature validation
2026-06-26 21:52:18 +08:00
48911b658b
Use normalized full config signatures
2026-06-26 21:28:10 +08:00
7f50b8b8ea
Document bad-start validation results
2026-06-26 20:50:20 +08:00
c8a0f9870e
Tighten topology and auto-high validation
2026-06-26 20:07:23 +08:00
1dd3eaebaa
Add auto search high measurement policy
2026-06-26 20:05:22 +08:00
95ad124a1b
Document auto search high policy
2026-06-26 19:53:30 +08:00
384cb58f1f
Add declarative harness prototype
2026-06-26 18:07:02 +08:00
4075c7abf0
Design declarative intervention harness
2026-06-26 17:15:06 +08:00
92eb186006
Add bad-start harness recovery planning
2026-06-26 16:44:24 +08:00
ce36cd79af
Document no-LLM harness mechanism
2026-06-25 10:32:29 +08:00
013b01baa1
Stop after gmu ceiling validation is exhausted
2026-06-24 22:45:42 +08:00
b075afe6f2
Continue gmu hill-climb after topology validation
2026-06-24 19:09:35 +08:00
8fa758797e
Guard generic topology search from introducing EP
2026-06-24 15:21:22 +08:00
c245774d76
Ignore generated run configs
2026-06-24 11:48:21 +08:00
d85572e7b5
Update AITuner roadmap framing
2026-06-24 11:45:42 +08:00
c0a9235b80
Document vLLM-first harness roadmap
2026-06-24 11:23:39 +08:00
c4173b2b3b
Document remote proxy setup
2026-06-23 20:12:53 +08:00
6d874ecbff
Update Qwen235B progress snapshot
2026-06-23 18:24:57 +08:00
403ae2e2b7
Document Qwen235B 2x2 progress
2026-06-23 18:23:56 +08:00
861d754f29
Localize Qwen27B harness ablation doc
2026-06-23 18:14:35 +08:00
76ec19224c
Document Qwen27B 2x2 harness ablation
2026-06-23 10:08:46 +08:00
e67bc86240
Probe coupled prefill runtime knobs before stop
2026-06-22 19:30:23 +08:00
fd94ab9f3b
Prevent prefill convergence stop before seq probe
2026-06-22 14:43:55 +08:00
4607711bb5
Add reusable clean pair runner
2026-06-22 00:05:31 +08:00
d23b69219b
Add clean dash1 harness ablation runner
2026-06-21 00:51:08 +08:00
488fae7e63
Add tuning progress report for harness evaluation
2026-06-21 00:48:21 +08:00
426151bc9f
Harness stop uses full state baseline
2026-06-20 22:48:27 +08:00
a9d237bbfd
Show effective flags in ablation trajectory
2026-06-20 10:24:53 +08:00
5257fbc1a2
Improve harness incumbent follow-up search
2026-06-20 05:37:15 +08:00
b3156a382a
Harness: gate gpu-mem-util/seqs-raise on 'no untested TP increase' (frontier-closed)
The first gpt-5.5 verification run exposed a bug in the prior gate. With
topology_settled = cur_tp > base_tp, gpu-memory-utilization could fire on a TP2
incumbent (TP2 > baseline TP1) and preempt the still-open TP4 frontier: the harness
proposed TP2 + gpu-mem-util=0.92 at iter 2 instead of climbing to TP4. The candidate
path runs before the topology-frontier check, so a runtime candidate with
score >= 0.35 wins.
Fix: gate runtime micro-tuning (gpu-mem-util, raising max-num-seqs) on the TP frontier
being closed, i.e. topology_settled = no untested _next_allowed_tp remains. This
respects GPU count, so TP4 is the real ceiling on 6 GPUs. New regression test: a TP2
incumbent with TP4 reachable must climb TP and must NOT propose gpu-mem-util.
116 tests pass.
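A minimal sketch of the gate change described above, not the harness code itself.
Only topology_settled, cur_tp, base_tp and _next_allowed_tp come from this commit;
the function signatures, the power-of-two TP stepping, and the tested_tps set are
assumptions made for illustration.

```python
from typing import Optional, Set


def next_allowed_tp(cur_tp: int, gpu_count: int) -> Optional[int]:
    # Hypothetical stand-in for _next_allowed_tp: assumes TP doubles at each
    # step and must fit within the available GPU count.
    nxt = cur_tp * 2
    return nxt if nxt <= gpu_count else None


def old_topology_settled(cur_tp: int, base_tp: int) -> bool:
    # Prior gate: any incumbent above the baseline TP counted as settled.
    return cur_tp > base_tp


def first_untested_tp(cur_tp: int, gpu_count: int, tested_tps: Set[int]) -> Optional[int]:
    # Walk up the TP ladder and return the first size not yet measured.
    tp = next_allowed_tp(cur_tp, gpu_count)
    while tp is not None and tp in tested_tps:
        tp = next_allowed_tp(tp, gpu_count)
    return tp


def topology_settled(cur_tp: int, gpu_count: int, tested_tps: Set[int]) -> bool:
    # New gate: settled only when no untested TP increase remains.
    return first_untested_tp(cur_tp, gpu_count, tested_tps) is None


def propose(cur_tp: int, gpu_count: int, tested_tps: Set[int]) -> str:
    # Runtime micro-tuning is allowed only once the TP frontier is closed.
    nxt = first_untested_tp(cur_tp, gpu_count, tested_tps)
    if nxt is not None:
        return f"tensor-parallel-size={nxt}"
    return "gpu-memory-utilization"
```

Under these assumptions, a TP2 incumbent on 6 GPUs with only TP1 and TP2 measured
is reported as settled by the old gate but not by the new one, so propose() climbs
to TP4 before any gpu-mem-util candidate is considered.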
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 13:33:29 +08:00
76cca89a43
Add harness-only dash1 driver to verify the gpu-mem-util fix recovers ~0.87 + stops
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 11:29:32 +08:00