aituner/examples at a1b804f879d86f90c009b79912726a24ed499120 - aituner - Local Gitea

gahow/aituner

Gahow Wang a1b804f879 Ablation: search.high 0.25 -> 0.15 (skip wildly-infeasible top probes)

Smoke on the real-output substrate measured feasible sampling_u = 0.0156 (TP2)
and 0.0742 (TP4, per-GPU 0.618 = 2.24x TP2). search.high=0.25 made the binary
search waste its two top probes (u=0.125/0.0625, always infeasible, admitting the
most long-output requests) on every trial. 0.15 keeps ~2x headroom over the TP4
boundary (0.0742) and trims ~15-20% of per-trial cost with identical feasibility
results; if a runtime-tuned config ever saturates 0.15 the harness search-high
saturation stop fires (informative, not silent).
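
The probe arithmetic in the message above can be checked with a short sketch. It assumes the search is a plain midpoint bisection starting from a lower bound of 0, which is consistent with the probe values quoted (0.125, 0.0625) but is not confirmed by this page. The names `top_probes` and `headroom` are hypothetical helpers for illustration, not part of aituner.

```python
def top_probes(high, count, low=0.0):
    """Return the first `count` bisection midpoints, assuming every probe is infeasible.

    Assumption: midpoint bisection over [low, high]; the real search may differ.
    """
    probes = []
    for _ in range(count):
        mid = (low + high) / 2.0
        probes.append(mid)
        high = mid  # an infeasible probe moves the upper bound down to the midpoint
    return probes


def headroom(high, boundary):
    """Ratio of the search ceiling to a measured feasibility boundary."""
    return high / boundary


TP4_BOUNDARY = 0.0742  # feasible sampling_u for TP4, taken from the commit message

print(top_probes(0.25, 2))                      # [0.125, 0.0625]
print(top_probes(0.15, 2))                      # the two top probes under the new ceiling
print(round(headroom(0.15, TP4_BOUNDARY), 2))   # 2.02
```

Under these assumptions, lowering search.high to 0.15 starts the search at roughly 0.075 instead of 0.125, while the ceiling stays about twice the TP4 boundary.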

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-17 22:11:52 +08:00

dash0_smoke_proposals

Add study tune loop and smoke configs

2026-04-04 22:29:59 +08:00

Add 27B TP A/B (deterministic ground-truth: does TP2 beat TP1 per-GPU)

2026-06-15 18:39:54 +08:00

Add study tune loop and smoke configs

2026-04-04 22:29:59 +08:00

capability.example.json

Initial AITuner study orchestrator

2026-04-04 21:26:37 +08:00

compare.example.json

Add multi-window baseline vs tuned compare flow

2026-04-11 13:51:54 +08:00

dash0_llm_10min_study_run1f.json

Tighten LLM proposal schema

2026-04-04 23:24:32 +08:00

dash0_manual_trial2_maxprobes6.json

Add deeper infeasible probe diagnostics

2026-04-05 01:44:38 +08:00

dash0_manual_trial2_probe_003125.json

Add targeted low-threshold probe specs

2026-04-05 02:08:27 +08:00

dash0_manual_trial2_probe_0015625.json

Add targeted low-threshold probe specs

2026-04-05 02:08:27 +08:00

dash0_manual_trial2_proposal.json

Add deeper infeasible probe diagnostics

2026-04-05 01:44:38 +08:00

dash0_qwen27b_ablation_harness_on.json

Ablation: search.high 0.25 -> 0.15 (skip wildly-infeasible top probes)

2026-06-17 22:11:52 +08:00

dash0_qwen27b_ablation_naive_off.json

Ablation: search.high 0.25 -> 0.15 (skip wildly-infeasible top probes)

2026-06-17 22:11:52 +08:00

dash0_qwen27b_stopB_loop.json

Add 27B Stop-B agentic-loop config (harness-driven, GPUs 2-7)

2026-06-16 09:08:46 +08:00

dash0_qwen27b_tight_slo_baseline_proposal.json

Align qwen27b baseline proposal to TP1 run script

2026-04-11 00:40:05 +08:00

dash0_qwen27b_tight_slo_run1.json

Add trace length bucket tuning support

2026-04-07 11:03:16 +08:00

dash0_qwen27b_tight_slo_run2.json

Add trace length bucket tuning support

2026-04-07 11:03:16 +08:00

dash0_qwen27b_tight_slo_run3.json

Add trace length bucket tuning support

2026-04-07 11:03:16 +08:00

dash0_qwen27b_tight_slo_run4_0_8k.json

Fix topology-aware incumbents for qwen27b tuning

2026-04-11 00:32:41 +08:00

dash0_qwen27b_tp_ab.json

Pin 27B A/B to GPUs 2-7 (route around leaked GPU0/1 memory)

2026-06-15 23:01:22 +08:00

dash0_qwen30b_a3b_community_vllm020_harness.json

Use time-compressed community vllm ablation

2026-05-02 10:03:59 +08:00

dash0_qwen30b_a3b_community_vllm020_noharness.json

Use time-compressed community vllm ablation

2026-05-02 10:03:59 +08:00

dash0_qwen30b_a3b_stopA_fulldata.json

Fix Stop-A validation config: system vllm, cap max-model-len

2026-06-15 15:22:48 +08:00

dash0_qwen30b_a3b_stopA_on.json

Add Stop-A ON config (adaptive_stop enabled + boundary guard) for A/B

2026-06-15 16:25:24 +08:00

dash0_qwen30b_a3b_stopB_e2e.json

Phase 5: widen search.high to 1.0 to force multi-iteration Stop-B convergence

2026-06-15 17:12:32 +08:00

dash0_qwen235b_decode_thinking_baseline.json

Add decode-only study mode support

2026-04-09 11:23:17 +08:00

dash0_qwen235b_decode_thinking_run1.json

Document decode harness one-shot mechanism

2026-05-02 06:25:06 +08:00

dash0_qwen235b_decode_thinking_run2_tpot40.json

Document decode harness one-shot mechanism

2026-05-02 06:25:06 +08:00

dash0_qwen235b_prefill_thinking_baseline.json

Add qwen235b prefill-only tuning support

2026-04-11 21:00:02 +08:00

dash0_qwen235b_prefill_thinking_run1_ttft.json

Add qwen235b prefill-only tuning support

2026-04-11 21:00:02 +08:00

dash0_qwen235b_prefill_thinking_run2_ttft_tight.json

Add qwen235b prefill docs and tight TTFT spec

2026-04-12 11:24:23 +08:00

dash0_qwen235b_prefill_thinking_run3_ttft_tight_0323.json

configs: add qwen235b prefill tight ttft 0323 study

2026-04-13 09:39:32 +08:00

dash0_smoke_study.json

Add probe early stop guards

2026-04-04 22:58:33 +08:00

dash1_qwen235b_prefill_thinking_7day_compare.json

compare: add multi-candidate runner

2026-04-13 20:50:39 +08:00

dash1_qwen235b_prefill_thinking_run4_ttft_tight_0_32k.json

configs: add qwen235b prefill 0-32k study

2026-04-17 19:20:44 +08:00

dash1_qwen235b_prefill_thinking_run5_ttft_1s_2s_0_32k.json

configs: add q235b prefill 1s 2s 0-32k study

2026-04-17 19:25:32 +08:00

study.example.json

Add codex and bailian LLM provider presets

2026-04-07 11:31:26 +08:00