# AITuner

AITuner is a small study orchestrator for OpenAI-compatible serving engines. It replays trace windows, searches for the highest feasible offered load under configured SLOs, and records enough trial context for LLM- or harness-guided configuration proposals.

## Status

This repository is research tooling. Treat reported experiment numbers as valid only when the matching study spec, trial artifacts, probe history, and `probe_details.jsonl` files are available for audit.

## Install

```bash
python3 -m pip install -e .
```

## Test

The test suite uses the Python standard library `unittest` runner:

```bash
PYTHONPATH=src python3 -m unittest discover -s tests -v
```

If the package is installed in editable mode, `PYTHONPATH=src` is optional.

## Basic Workflow

Initialize a study:

```bash
aituner study init --spec configs/examples/study.example.json
```

Run a local tuning loop:

```bash
aituner study tune --spec configs/examples/study.example.json --max-trials 2
```

Run a compare:

```bash
aituner compare run --spec configs/examples/compare.example.json
```

Remote experiment notes for this checkout live in `AGENTS.md`. The default remote host is `dash0`, and code should be synchronized through Git before remote runs.

## Experiment Integrity

- Fixed-length replay requests are scored only when completion token usage is verifiable and matches the trace expectation.
- Each trial writes aggregate probe history and per-request probe details.
- `request_rate_per_gpu` is the primary cross-topology metric: `best_feasible_request_rate / (tensor_parallel_size * data_parallel_size)`.
- Compare reports include failed and no-feasible window counts; do not interpret mean request rates without those counts.
- Bounded replays using `max_requests_per_probe`, `completion_tokens_override`, or `replay_time_scale` are convergence tests for that bounded workload, not production benchmarks.
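The `request_rate_per_gpu` normalization above can be sketched as follows. This is an illustrative helper, not the repository's internal API; the function name and signature are assumptions, but the formula matches the one stated in the list:

```python
def request_rate_per_gpu(best_feasible_request_rate: float,
                         tensor_parallel_size: int,
                         data_parallel_size: int) -> float:
    """Normalize the best feasible offered load by total GPU count.

    Total GPUs is the product of the tensor- and data-parallel degrees,
    so the metric stays comparable across serving topologies.
    """
    total_gpus = tensor_parallel_size * data_parallel_size
    if total_gpus <= 0:
        raise ValueError("parallel sizes must be positive integers")
    return best_feasible_request_rate / total_gpus


# Example: 12 req/s found feasible on a TP=2, DP=2 (4-GPU) deployment
# normalizes to 3.0 req/s per GPU.
print(request_rate_per_gpu(12.0, 2, 2))
```

Because the divisor is the full GPU count, a TP=4/DP=1 and a TP=1/DP=4 deployment that sustain the same offered load report the same per-GPU rate.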
## Configuration Notes

Example specs that use `llm.endpoint.provider=codex` resolve the endpoint from the local Codex configuration unless `llm.endpoint.base_url` or `AITUNER_CODEX_BASE_URL` is set. Public, reproducible examples should prefer an explicit endpoint or omit the LLM endpoint and use proposal files.
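As a hedged illustration, an explicit-endpoint spec fragment might look like the sketch below. Only the `llm.endpoint.provider` and `llm.endpoint.base_url` key paths are taken from this README; the surrounding structure and the placeholder URL are assumptions, so check them against the shipped examples in `configs/examples/`:

```json
{
  "llm": {
    "endpoint": {
      "provider": "codex",
      "base_url": "http://localhost:8000/v1"
    }
  }
}
```

Pinning `base_url` in the spec this way avoids depending on the local Codex configuration or the `AITUNER_CODEX_BASE_URL` environment variable, which is what makes the example reproducible for other users.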