03e556f0ab
Add Stop-A ON config (adaptive_stop enabled + boundary guard) for A/B
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 16:25:24 +08:00
958739027a
Fix Stop-A validation config: system vllm, cap max-model-len
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 15:22:48 +08:00
0f57ee96a9
Drop LLM endpoint from Stop-A full-data config (baseline-only run)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 15:19:46 +08:00
3af1d84ac0
Add Stop-A full-data validation config (real-time replay, no cap)
A single-config baseline run with adaptive_stop disabled and replay_time_scale=1.0,
so per-request probe_details capture the full 600s window for offline analysis of
whether truncating at the L-C-A convergence prefix preserves the feasibility verdict.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 15:15:12 +08:00
ccbf24ac47
Use time-compressed community vllm ablation
2026-05-02 10:03:59 +08:00
d3d4c234f6
Bound community vllm ablation replay
2026-05-02 09:58:56 +08:00
4ef69cce78
Make harness stop conservative for ablation
2026-05-02 09:47:16 +08:00
664aeb49b2
Use local cache for qwen30b vllm runs
2026-05-02 08:47:16 +08:00
1880e859b5
Use vllm cu129 wheel on dash0
2026-05-02 08:28:23 +08:00
e215827503
Use uv auto torch backend for vllm 0.20
2026-05-02 08:21:27 +08:00
a7c9518ef6
Use local vllm venv for dash0 community run
2026-05-02 08:17:04 +08:00
1a3d628268
Add harness early stop ablation
2026-05-02 08:08:14 +08:00
6d3459c82d
Document decode harness one-shot mechanism
2026-05-02 06:25:06 +08:00
9919b9a7bd
configs: add q235b prefill 1s 2s 0-32k study
2026-04-17 19:25:32 +08:00
34eb495b3e
configs: add qwen235b prefill 0-32k study
2026-04-17 19:20:44 +08:00
26f3b46966
compare: add multi-candidate runner
2026-04-13 20:50:39 +08:00
18ff644b32
configs: add qwen235b prefill tight ttft 0323 study
2026-04-13 09:39:32 +08:00
edfd61a696
Add qwen235b prefill docs and tight TTFT spec
2026-04-12 11:24:23 +08:00
3f20ddf87e
Add qwen235b prefill-only tuning support
2026-04-11 21:00:02 +08:00
5e54e9c8f5
Add multi-window baseline vs tuned compare flow
2026-04-11 13:51:54 +08:00
31dd44c54b
Align qwen27b baseline proposal to TP1 run script
2026-04-11 00:40:05 +08:00
a4d54442db
Fix topology-aware incumbents for qwen27b tuning
2026-04-11 00:32:41 +08:00
06d4c380b3
Align qwen27b baseline proposal with topology study
2026-04-10 17:43:02 +08:00
8d0777e5e2
Add topology-aware qwen27b 0-8k tuning
2026-04-10 17:41:54 +08:00
9422d43737
Prioritize topology exploration in decode tuning
2026-04-10 10:25:41 +08:00
d582a8ed1b
Validate served model name consistency
2026-04-09 22:50:23 +08:00
ef78fe7eb5
Add topology-aware tuning constraints
2026-04-09 21:07:51 +08:00
581ef7ccea
Add qwen235b decode TPOT40 study config
2026-04-09 12:57:05 +08:00
c158807fac
Add decode-only study mode support
2026-04-09 11:23:17 +08:00
94c89e1103
Add codex and bailian LLM provider presets
2026-04-07 11:31:26 +08:00
46ed688ace
Add trace length bucket tuning support
2026-04-07 11:03:16 +08:00
e9b5e9b957
Add targeted low-threshold probe specs
2026-04-05 02:08:27 +08:00
84c5d6bd80
Add deeper infeasible probe diagnostics
2026-04-05 01:44:38 +08:00
8b024c72f1
Tighten LLM proposal schema
2026-04-04 23:24:32 +08:00
7e8523fdaa
Add probe early stop guards
2026-04-04 22:58:33 +08:00
56fa6747d2
Add replay time scaling for smoke tuning
2026-04-04 22:40:49 +08:00
dcb972014a
Enable BLADNN for dash0 fp4 smoke study
2026-04-04 22:32:55 +08:00
f192c741ed
Add study tune loop and smoke configs
2026-04-04 22:29:59 +08:00
7b7eaafd78
Use time-based trace window ids
2026-04-04 22:09:43 +08:00
gahow
cdcca1d9d7
Initial AITuner study orchestrator
2026-04-04 21:26:37 +08:00