Commit Graph

45 Commits

Author SHA1 Message Date
edfd61a696 Add qwen235b prefill docs and tight TTFT spec 2026-04-12 11:24:23 +08:00
3f20ddf87e Add qwen235b prefill-only tuning support 2026-04-11 21:00:02 +08:00
5e54e9c8f5 Add multi-window baseline vs tuned compare flow 2026-04-11 13:51:54 +08:00
a0b2d7eab2 Add qwen27b and qwen235b tuning notes 2026-04-11 12:07:42 +08:00
31dd44c54b Align qwen27b baseline proposal to TP1 run script 2026-04-11 00:40:05 +08:00
83325b2f76 Reset new topology groups to full binary search 2026-04-11 00:36:45 +08:00
a4d54442db Fix topology-aware incumbents for qwen27b tuning 2026-04-11 00:32:41 +08:00
06d4c380b3 Align qwen27b baseline proposal with topology study 2026-04-10 17:43:02 +08:00
8d0777e5e2 Add topology-aware qwen27b 0-8k tuning 2026-04-10 17:41:54 +08:00
b960607d8f Add qwen235b thinking decode tuning note 2026-04-10 17:33:08 +08:00
9422d43737 Prioritize topology exploration in decode tuning 2026-04-10 10:25:41 +08:00
d582a8ed1b Validate served model name consistency 2026-04-09 22:50:23 +08:00
baba1a3c4f Ignore decode study artifacts 2026-04-09 21:08:29 +08:00
ef78fe7eb5 Add topology-aware tuning constraints 2026-04-09 21:07:51 +08:00
7371d6635c Force codex stream to use chat completions 2026-04-09 14:49:40 +08:00
581ef7ccea Add qwen235b decode TPOT40 study config 2026-04-09 12:57:05 +08:00
ceafecd8f0 Fix list flag serialization for engine launch 2026-04-09 11:52:27 +08:00
c158807fac Add decode-only study mode support 2026-04-09 11:23:17 +08:00
96140b79bb Add streaming LLM proposal support 2026-04-09 01:06:45 +08:00
46151512cd Support codex reasoning effort override 2026-04-09 00:57:33 +08:00
0990a3771e Support codex responses API 2026-04-09 00:55:05 +08:00
79ba8a50c8 Repair truncated LLM proposal JSON 2026-04-07 11:38:08 +08:00
94c89e1103 Add codex and bailian LLM provider presets 2026-04-07 11:31:26 +08:00
f73a8a5767 Ignore remote tuning artifacts 2026-04-07 11:12:37 +08:00
46ed688ace Add trace length bucket tuning support 2026-04-07 11:03:16 +08:00
e9b5e9b957 Add targeted low-threshold probe specs 2026-04-05 02:08:27 +08:00
84c5d6bd80 Add deeper infeasible probe diagnostics 2026-04-05 01:44:38 +08:00
0aa607a4f1 Kill engine process groups on trial cleanup 2026-04-05 01:30:05 +08:00
e00bedb466 Stop waiting on in-flight requests after early stop 2026-04-05 00:56:26 +08:00
75a9842f1a Bypass proxies for loopback engines 2026-04-04 23:50:42 +08:00
7632de8dad Record failed trial context 2026-04-04 23:35:07 +08:00
8b024c72f1 Tighten LLM proposal schema 2026-04-04 23:24:32 +08:00
00778eff42 Harden LLM proposal parsing 2026-04-04 23:19:42 +08:00
0b7cad7da3 Normalize OpenAI base URLs 2026-04-04 23:17:17 +08:00
7e8523fdaa Add probe early stop guards 2026-04-04 22:58:33 +08:00
56fa6747d2 Add replay time scaling for smoke tuning 2026-04-04 22:40:49 +08:00
dcb972014a Enable BLADNN for dash0 fp4 smoke study 2026-04-04 22:32:55 +08:00
f192c741ed Add study tune loop and smoke configs 2026-04-04 22:29:59 +08:00
7b7eaafd78 Use time-based trace window ids 2026-04-04 22:09:43 +08:00
4e1401f50c Stream trace window materialization 2026-04-04 21:49:03 +08:00
69f666593e Speed up raw trace window extraction 2026-04-04 21:42:02 +08:00
65b122fd4b Add raw trace window preparation script 2026-04-04 21:37:51 +08:00
b33d1356e7 Honor exact output lengths in replay requests 2026-04-04 21:33:26 +08:00
647d241725 Support length-only trace windows 2026-04-04 21:31:11 +08:00
gahow cdcca1d9d7 Initial AITuner study orchestrator 2026-04-04 21:26:37 +08:00