-
d8899c50ce
Add interaction screening matrix generator
main
Gahow Wang
2026-07-01 14:28:34 +08:00
-
46b477f48e
Add initial config preflight review
Gahow Wang
2026-07-01 11:12:58 +08:00
-
1b8f5a3af1
Integrate descriptor runtime candidates into harness
Gahow Wang
2026-06-30 14:10:19 +08:00
-
adb5356c4b
Add advisory harness attribution and descriptor planner MVP
Gahow Wang
2026-06-30 12:05:03 +08:00
-
08429e5da8
Refine harness design flow overview
Gahow Wang
2026-06-29 20:41:54 +08:00
-
00ba573631
Document harness design contract
Gahow Wang
2026-06-29 20:26:58 +08:00
-
6ea259a0a3
Keep target topology explicit in delta projections
Gahow Wang
2026-06-29 19:56:50 +08:00
-
6b4efdad82
Relax lower-frontier delta projection gate
Gahow Wang
2026-06-29 17:57:29 +08:00
-
9ef9550214
Use full state for frontier projection
Gahow Wang
2026-06-29 16:22:09 +08:00
-
8dd9ada194
Add frontier delta projection harness candidates
Gahow Wang
2026-06-29 16:15:06 +08:00
-
6c84dc91d7
Document hardened topology feedback
Gahow Wang
2026-06-29 02:34:12 +08:00
-
1c4ed4cab3
Document hardened harness feedback
Gahow Wang
2026-06-29 02:28:30 +08:00
-
6b25d56c1f
Gate GMU climb on measured improvement
Gahow Wang
2026-06-29 02:00:41 +08:00
-
ee101a7c24
Harden prefill scheduler harness
Gahow Wang
2026-06-29 01:54:02 +08:00
-
bfd85793f3
Prioritize uncovered prefill scheduler candidates
Gahow Wang
2026-06-29 01:30:34 +08:00
-
36c301c128
Add normalized prefill scheduler harness
Gahow Wang
2026-06-29 01:12:19 +08:00
-
7ad439730e
Add llm-first tuning proposal policy
Gahow Wang
2026-06-27 12:21:51 +08:00
-
9accf2575e
Require harness proposals from candidate sets
Gahow Wang
2026-06-27 01:03:30 +08:00
-
bef260f183
Document bad-start robustness suite
Gahow Wang
2026-06-26 22:19:46 +08:00
-
2937539b49
Persist harness candidate set snapshots
Gahow Wang
2026-06-26 22:17:47 +08:00
-
5080b50315
Veto repeated materialized configs
Gahow Wang
2026-06-26 22:15:47 +08:00
-
825d3e03e9
Add harness candidate set audit
Gahow Wang
2026-06-26 22:02:09 +08:00
-
42f75553a6
Document full config signature validation
Gahow Wang
2026-06-26 21:52:18 +08:00
-
48911b658b
Use normalized full config signatures
Gahow Wang
2026-06-26 21:28:10 +08:00
-
7f50b8b8ea
Document bad-start validation results
Gahow Wang
2026-06-26 20:50:20 +08:00
-
c8a0f9870e
Tighten topology and auto-high validation
Gahow Wang
2026-06-26 20:07:23 +08:00
-
1dd3eaebaa
Add auto search high measurement policy
Gahow Wang
2026-06-26 20:05:22 +08:00
-
95ad124a1b
Document auto search high policy
Gahow Wang
2026-06-26 19:53:30 +08:00
-
384cb58f1f
Add declarative harness prototype
Gahow Wang
2026-06-26 18:07:02 +08:00
-
4075c7abf0
Design declarative intervention harness
Gahow Wang
2026-06-26 17:15:06 +08:00
-
92eb186006
Add bad-start harness recovery planning
Gahow Wang
2026-06-26 16:44:24 +08:00
-
ce36cd79af
Document no-LLM harness mechanism
Gahow Wang
2026-06-25 10:32:29 +08:00
-
013b01baa1
Stop after gmu ceiling validation is exhausted
Gahow Wang
2026-06-24 22:45:42 +08:00
-
b075afe6f2
Continue gmu hill-climb after topology validation
Gahow Wang
2026-06-24 19:09:35 +08:00
-
8fa758797e
Guard generic topology search from introducing EP
Gahow Wang
2026-06-24 15:21:22 +08:00
-
c245774d76
Ignore generated run configs
feat/fig18-real-output-lca-substrate
Gahow Wang
2026-06-24 11:48:21 +08:00
-
d85572e7b5
Update AITuner roadmap framing
Gahow Wang
2026-06-24 11:45:42 +08:00
-
c0a9235b80
Document vLLM-first harness roadmap
Gahow Wang
2026-06-24 11:23:39 +08:00
-
c4173b2b3b
Document remote proxy setup
Gahow Wang
2026-06-23 20:12:53 +08:00
-
6d874ecbff
Update Qwen235B progress snapshot
Gahow Wang
2026-06-23 18:24:57 +08:00
-
403ae2e2b7
Document Qwen235B 2x2 progress
Gahow Wang
2026-06-23 18:23:56 +08:00
-
861d754f29
Localize Qwen27B harness ablation doc
Gahow Wang
2026-06-23 18:14:35 +08:00
-
76ec19224c
Document Qwen27B 2x2 harness ablation
Gahow Wang
2026-06-23 10:08:46 +08:00
-
e67bc86240
Probe coupled prefill runtime knobs before stop
Gahow Wang
2026-06-22 19:30:23 +08:00
-
fd94ab9f3b
Prevent prefill convergence stop before seq probe
Gahow Wang
2026-06-22 14:43:55 +08:00
-
4607711bb5
Add reusable clean pair runner
Gahow Wang
2026-06-22 00:05:31 +08:00
-
d23b69219b
Add clean dash1 harness ablation runner
Gahow Wang
2026-06-21 00:51:08 +08:00
-
488fae7e63
Add tuning progress report for harness evaluation
Gahow Wang
2026-06-21 00:48:21 +08:00
-
426151bc9f
Harness stop uses full state baseline
Gahow Wang
2026-06-20 22:48:27 +08:00
-
a9d237bbfd
Show effective flags in ablation trajectory
Gahow Wang
2026-06-20 10:24:53 +08:00
-
5257fbc1a2
Improve harness incumbent follow-up search
Gahow Wang
2026-06-20 05:37:15 +08:00
-
b3156a382a
Harness: gate gpu-mem-util/seqs-raise on 'no untested TP increase' (frontier-closed)
Gahow Wang
2026-06-19 13:33:29 +08:00
-
76cca89a43
Add harness-only dash1 driver to verify the gpu-mem-util fix recovers ~0.87 + stops
Gahow Wang
2026-06-19 11:29:32 +08:00
-
83162e7a64
Ablation: pin gpt-5.5 @ ai.gahow.org (chat.completions); re-read token per arm
Gahow Wang
2026-06-19 11:27:47 +08:00
-
a3523f5601
Harness: explore gpu-memory-utilization (and raise max-num-seqs) before Stop-B
Gahow Wang
2026-06-19 10:25:47 +08:00
-
95c02d7dd9
Fig-18: chained driver for 2 extra naive runs (n=3 nondeterminism)
Gahow Wang
2026-06-18 09:06:05 +08:00
-
a1b804f879
Ablation: search.high 0.25 -> 0.15 (skip wildly-infeasible top probes)
Gahow Wang
2026-06-17 22:11:52 +08:00
-
0c23285f39
Fig18 substrate: real output_length + criterion-A time_scale + Stop-A drain deadline
Gahow Wang
2026-06-17 17:24:00 +08:00
-
816765071f
Complete harness-vs-naive ablation: harness 3x faster + stops; naive nondeterministic
Gahow Wang
2026-06-17 13:03:26 +08:00
-
97d2ddabb1
Ablation driver: force direct LLM connection (codex proxy is dash0-local)
Gahow Wang
2026-06-17 10:05:44 +08:00
-
8e58b4033d
Note dash1 lacks LLM gateway access (naive-completion deferred to dash0)
Gahow Wang
2026-06-17 09:55:39 +08:00
-
b779f6e56a
Add dash1 naive-completion driver for the ablation
Gahow Wang
2026-06-17 09:52:54 +08:00
-
e7d1b3ba01
Harness-vs-naive ablation result: harness steers to TP & converges; naive wanders
Gahow Wang
2026-06-17 09:51:56 +08:00
-
579dd86698
Ablation: --skip-baseline so loops climb from first proposal
Gahow Wang
2026-06-16 20:59:46 +08:00
-
37342a5749
Add chained harness-vs-naive ablation driver (sequential runs + DONE marker)
Gahow Wang
2026-06-16 20:30:41 +08:00
-
5965f4fbbc
Ablation substrate: scale=0.5 + out=128 + 6 probes (TP1 measurable, tractable)
Gahow Wang
2026-06-16 20:29:30 +08:00
-
a1cbab0e69
Document harness-vs-naive ablation: setup, substrate calibration, blocker
Gahow Wang
2026-06-16 20:16:27 +08:00
-
0794efa249
Reduce ablation probe budget to 3 per trial for tractability
Gahow Wang
2026-06-16 20:01:19 +08:00
-
d975e57bb5
Scale ablation early-stop caps to the compressed window (scale=0.2)
Gahow Wang
2026-06-16 19:49:57 +08:00
-
a16016a876
Add harness vs naive ablation configs (27b, scale=0.2 substrate)
Gahow Wang
2026-06-16 19:31:23 +08:00
-
07f5d92e1d
Add consolidated two-stop summary doc
Gahow Wang
2026-06-16 19:16:28 +08:00
-
f2ff0faebd
Document Stop-B end-to-end on dense 27B: the improving climb + no-regression
feat/two-stop
Gahow Wang
2026-06-16 18:07:00 +08:00
-
4a64196a99
Add 27B Stop-B agentic-loop config (harness-driven, GPUs 2-7)
Gahow Wang
2026-06-16 09:08:46 +08:00
-
b17b213575
Tear down the engine on SIGTERM instead of orphaning it
Gahow Wang
2026-06-16 09:08:06 +08:00
-
93ce339d61
Document 27B TP sweep: per-GPU rises sharply with TP (dense), opposite of MoE
Gahow Wang
2026-06-16 01:54:40 +08:00
-
b1b74318f6
Pin 27B A/B to GPUs 2-7 (route around leaked GPU0/1 memory)
Gahow Wang
2026-06-15 23:01:22 +08:00
-
2fcaf80450
Wrap socket/timeout errors in HTTP client as HttpClientError
Gahow Wang
2026-06-15 22:58:28 +08:00
-
3541065675
Speed up 27B TP A/B: request_timeout 180s, search.high 0.125
Gahow Wang
2026-06-15 22:40:42 +08:00
-
7678c7d5e8
Switch 27B TP A/B to length-aware TTFT SLO (4s + L_in/8k), widen search
Gahow Wang
2026-06-15 20:35:23 +08:00
-
ed2bbe0323
Add linear_ms SLO rule (length-aware TTFT budget)
Gahow Wang
2026-06-15 20:35:23 +08:00
-
77af4ded2a
Flag Stop-B e2e per-GPU trajectory as non-benchmark (saturation + smoke regime)
Gahow Wang
2026-06-15 18:40:38 +08:00
-
4f45b546a1
Add 27B TP A/B (deterministic ground-truth: does TP2 beat TP1 per-GPU)
Gahow Wang
2026-06-15 18:39:54 +08:00
-
90c3eb51c8
Document Stop-B end-to-end validation (Phase 5)
Gahow Wang
2026-06-15 17:58:44 +08:00
-
0b6beafeb8
Phase 5: widen search.high to 1.0 to force multi-iteration Stop-B convergence
Gahow Wang
2026-06-15 17:12:32 +08:00
-
d4aff81691
Add Stop-B end-to-end config (agentic loop, Stop-A enabled)
Gahow Wang
2026-06-15 17:05:39 +08:00
-
f31e9ccfd5
Record Stop-A boundary-guard A/B: correct verdict, ~38% replay saved
Gahow Wang
2026-06-15 16:57:53 +08:00
-
03e556f0ab
Add Stop-A ON config (adaptive_stop enabled + boundary guard) for A/B
Gahow Wang
2026-06-15 16:25:24 +08:00
-
dfc823f972
Add Stop-A SLO-boundary guard
Gahow Wang
2026-06-15 16:25:24 +08:00
-
9f52812753
Document Stop-A validation: calibration + GPU fidelity check
Gahow Wang
2026-06-15 16:03:16 +08:00
-
958739027a
Fix Stop-A validation config: system vllm, cap max-model-len
Gahow Wang
2026-06-15 15:22:48 +08:00
-
0f57ee96a9
Drop LLM endpoint from Stop-A full-data config (baseline-only run)
Gahow Wang
2026-06-15 15:19:46 +08:00
-
43125f48cf
Address review of two-stop branch
Gahow Wang
2026-06-15 15:19:08 +08:00
-
3af1d84ac0
Add Stop-A full-data validation config (real-time replay, no cap)
Gahow Wang
2026-06-15 15:15:12 +08:00
-
08e53fd897
Add Stop-A calibration script (CPU-only convergence curve)
Gahow Wang
2026-06-15 15:10:02 +08:00
-
a8f903498d
Add Stop-B authority: deterministic validator overrides LLM stop
Gahow Wang
2026-06-15 14:45:14 +08:00
-
51a9e4a007
Add Stop-A: offered-L-C-A convergence early-stop for replay
Gahow Wang
2026-06-15 14:23:49 +08:00
-
0f15bbc3f1
Make the offered-load axis session-coherent
Gahow Wang
2026-06-15 14:16:06 +08:00
-
6f8e3c95c1
Unify harness L-C-A on the canonical lca.WorkloadProfile
Gahow Wang
2026-06-15 14:12:17 +08:00
-
8b4116fad0
Add reference paper and qwen27b tpot25 16-iter notes
Gahow Wang
2026-06-15 14:02:30 +08:00
-
27d1c8fa92
Add L-C-A workload profile metric and CLI profile commands
Gahow Wang
2026-06-15 14:02:24 +08:00