103 Commits

SHA1 Message Date
26f3b46966 compare: add multi-candidate runner 2026-04-13 20:50:39 +08:00
18ff644b32 configs: add qwen235b prefill tight ttft 0323 study 2026-04-13 09:39:32 +08:00
bbecec4e9f docs: add qwen235b tight ttft prefill summary 2026-04-13 09:37:06 +08:00
ee9ec3c60b docs: add qwen235b decode 0323 summary 2026-04-13 09:33:02 +08:00
a1b96f7dd2 docs: update qwen27b 7-day compare 2026-04-13 09:16:31 +08:00
4625fba487 trace: make window materialization atomic 2026-04-12 23:09:30 +08:00
631a076498 trace: include weekend legacy windows 2026-04-12 22:43:02 +08:00
ade81b5549 docs: add qwen27b chat 0-8k compare summary 2026-04-12 22:39:57 +08:00
edfd61a696 Add qwen235b prefill docs and tight TTFT spec 2026-04-12 11:24:23 +08:00
3f20ddf87e Add qwen235b prefill-only tuning support 2026-04-11 21:00:02 +08:00
5e54e9c8f5 Add multi-window baseline vs tuned compare flow 2026-04-11 13:51:54 +08:00
a0b2d7eab2 Add qwen27b and qwen235b tuning notes 2026-04-11 12:07:42 +08:00
31dd44c54b Align qwen27b baseline proposal to TP1 run script 2026-04-11 00:40:05 +08:00
83325b2f76 Reset new topology groups to full binary search 2026-04-11 00:36:45 +08:00
a4d54442db Fix topology-aware incumbents for qwen27b tuning 2026-04-11 00:32:41 +08:00
06d4c380b3 Align qwen27b baseline proposal with topology study 2026-04-10 17:43:02 +08:00
8d0777e5e2 Add topology-aware qwen27b 0-8k tuning 2026-04-10 17:41:54 +08:00
b960607d8f Add qwen235b thinking decode tuning note 2026-04-10 17:33:08 +08:00
9422d43737 Prioritize topology exploration in decode tuning 2026-04-10 10:25:41 +08:00
d582a8ed1b Validate served model name consistency 2026-04-09 22:50:23 +08:00
baba1a3c4f Ignore decode study artifacts 2026-04-09 21:08:29 +08:00
ef78fe7eb5 Add topology-aware tuning constraints 2026-04-09 21:07:51 +08:00
7371d6635c Force codex stream to use chat completions 2026-04-09 14:49:40 +08:00
581ef7ccea Add qwen235b decode TPOT40 study config 2026-04-09 12:57:05 +08:00
ceafecd8f0 Fix list flag serialization for engine launch 2026-04-09 11:52:27 +08:00
c158807fac Add decode-only study mode support 2026-04-09 11:23:17 +08:00
96140b79bb Add streaming LLM proposal support 2026-04-09 01:06:45 +08:00
46151512cd Support codex reasoning effort override 2026-04-09 00:57:33 +08:00
0990a3771e Support codex responses API 2026-04-09 00:55:05 +08:00
79ba8a50c8 Repair truncated LLM proposal JSON 2026-04-07 11:38:08 +08:00
94c89e1103 Add codex and bailian LLM provider presets 2026-04-07 11:31:26 +08:00
f73a8a5767 Ignore remote tuning artifacts 2026-04-07 11:12:37 +08:00
46ed688ace Add trace length bucket tuning support 2026-04-07 11:03:16 +08:00
e9b5e9b957 Add targeted low-threshold probe specs 2026-04-05 02:08:27 +08:00
84c5d6bd80 Add deeper infeasible probe diagnostics 2026-04-05 01:44:38 +08:00
0aa607a4f1 Kill engine process groups on trial cleanup 2026-04-05 01:30:05 +08:00
e00bedb466 Stop waiting on in-flight requests after early stop 2026-04-05 00:56:26 +08:00
75a9842f1a Bypass proxies for loopback engines 2026-04-04 23:50:42 +08:00
7632de8dad Record failed trial context 2026-04-04 23:35:07 +08:00
8b024c72f1 Tighten LLM proposal schema 2026-04-04 23:24:32 +08:00
00778eff42 Harden LLM proposal parsing 2026-04-04 23:19:42 +08:00
0b7cad7da3 Normalize OpenAI base URLs 2026-04-04 23:17:17 +08:00
7e8523fdaa Add probe early stop guards 2026-04-04 22:58:33 +08:00
56fa6747d2 Add replay time scaling for smoke tuning 2026-04-04 22:40:49 +08:00
dcb972014a Enable BLADNN for dash0 fp4 smoke study 2026-04-04 22:32:55 +08:00
f192c741ed Add study tune loop and smoke configs 2026-04-04 22:29:59 +08:00
7b7eaafd78 Use time-based trace window ids 2026-04-04 22:09:43 +08:00
4e1401f50c Stream trace window materialization 2026-04-04 21:49:03 +08:00
69f666593e Speed up raw trace window extraction 2026-04-04 21:42:02 +08:00
65b122fd4b Add raw trace window preparation script 2026-04-04 21:37:51 +08:00