Commit Graph

88 Commits

SHA1 Message Date
d3d4c234f6 Bound community vllm ablation replay 2026-05-02 09:58:56 +08:00
4ef69cce78 Make harness stop conservative for ablation 2026-05-02 09:47:16 +08:00
664aeb49b2 Use local cache for qwen30b vllm runs 2026-05-02 08:47:16 +08:00
1880e859b5 Use vllm cu129 wheel on dash0 2026-05-02 08:28:23 +08:00
e215827503 Use uv auto torch backend for vllm 0.20 2026-05-02 08:21:27 +08:00
a7c9518ef6 Use local vllm venv for dash0 community run 2026-05-02 08:17:04 +08:00
1a3d628268 Add harness early stop ablation 2026-05-02 08:08:14 +08:00
6d3459c82d Document decode harness one-shot mechanism 2026-05-02 06:25:06 +08:00
9e5394b557 Inherit incumbent topology for runtime validation 2026-04-30 09:33:49 +08:00
f59919e21c Clarify base-relative validation patches 2026-04-30 06:52:09 +08:00
46e9040613 Record decode validation follow-up 2026-04-28 21:20:41 +08:00
38ff4380e5 Make strong incumbent trigger validation phase 2026-04-28 20:54:05 +08:00
68cdaf56a8 Summarize qwen235b decode harness result 2026-04-28 20:36:17 +08:00
f982395aad Record qwen235b decode harness launch 2026-04-28 07:02:13 +08:00
c9089cf4f0 Ignore non-SLO probe bookkeeping in bottleneck diagnosis 2026-04-28 06:58:38 +08:00
a9943e0240 Use probe sequence bottlenecks in harness 2026-04-28 06:57:45 +08:00
39aa47fbf1 Add generic decode-only harness guidance 2026-04-28 06:46:18 +08:00
71902b9fc2 Record qwen235b harness convergence test 2026-04-27 18:59:25 +08:00
bc884f6701 Document AITuner harness behavior 2026-04-27 16:34:19 +08:00
a962781b6c Document qwen27b harness convergence curve 2026-04-26 01:32:18 +08:00
29d0548e06 Stop after strong incumbent harness gains 2026-04-26 01:29:05 +08:00
a53445868e Make early-stop engine relaunch opt-in 2026-04-26 01:26:26 +08:00
d76ac49198 Relaunch engine after early-stopped probes 2026-04-26 00:32:39 +08:00
440f5b491b Record plateau guard verification 2026-04-25 18:50:23 +08:00
6bac389aae Add infeasible plateau guard to harness 2026-04-25 18:49:23 +08:00
6c04b9dbbc Evaluate baseline before LLM tuning 2026-04-25 17:14:05 +08:00
2d7ebe50ee Drain inflight requests after early stop 2026-04-25 16:57:01 +08:00
2dc2815620 Make harness verification portable 2026-04-25 16:37:13 +08:00
2c5e9af02a Add harness-guided tuning prompts 2026-04-25 16:35:33 +08:00
661db1e0c6 Document dash0 experiment workflow 2026-04-25 16:18:28 +08:00
dfe792ff6f docs: add q235b prefill 0-32k tight summary 2026-04-18 16:10:29 +08:00
d237fc2723 docs: expand qwen27b 0-8k compare summary 2026-04-17 20:45:24 +08:00
9919b9a7bd configs: add q235b prefill 1s 2s 0-32k study 2026-04-17 19:25:32 +08:00
34eb495b3e configs: add qwen235b prefill 0-32k study 2026-04-17 19:20:44 +08:00
bf286ef2a6 docs: add qwen235b prefill 7-day compare 2026-04-14 10:27:08 +08:00
26f3b46966 compare: add multi-candidate runner 2026-04-13 20:50:39 +08:00
18ff644b32 configs: add qwen235b prefill tight ttft 0323 study 2026-04-13 09:39:32 +08:00
bbecec4e9f docs: add qwen235b tight ttft prefill summary 2026-04-13 09:37:06 +08:00
ee9ec3c60b docs: add qwen235b decode 0323 summary 2026-04-13 09:33:02 +08:00
a1b96f7dd2 docs: update qwen27b 7-day compare 2026-04-13 09:16:31 +08:00
4625fba487 trace: make window materialization atomic 2026-04-12 23:09:30 +08:00
631a076498 trace: include weekend legacy windows 2026-04-12 22:43:02 +08:00
ade81b5549 docs: add qwen27b chat 0-8k compare summary 2026-04-12 22:39:57 +08:00
edfd61a696 Add qwen235b prefill docs and tight TTFT spec 2026-04-12 11:24:23 +08:00
3f20ddf87e Add qwen235b prefill-only tuning support 2026-04-11 21:00:02 +08:00
5e54e9c8f5 Add multi-window baseline vs tuned compare flow 2026-04-11 13:51:54 +08:00
a0b2d7eab2 Add qwen27b and qwen235b tuning notes 2026-04-11 12:07:42 +08:00
31dd44c54b Align qwen27b baseline proposal to TP1 run script 2026-04-11 00:40:05 +08:00
83325b2f76 Reset new topology groups to full binary search 2026-04-11 00:36:45 +08:00
a4d54442db Fix topology-aware incumbents for qwen27b tuning 2026-04-11 00:32:41 +08:00