Add open source project metadata

2026-05-06 21:18:21 +08:00
parent c1ff64381d
commit d7df1ebdac
10 changed files with 238 additions and 4 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,72 @@
+# AITuner
+
+AITuner is a small study orchestrator for OpenAI-compatible serving engines. It
+replays trace windows, searches for the highest feasible offered load under
+configured SLOs, and records enough trial context for LLM- or harness-guided
+configuration proposals.
+
+## Status
+
+This repository is research tooling. Treat reported experiment numbers as valid
+only when the matching study spec, trial artifacts, probe history, and
+`probe_details.jsonl` files are available for audit.
+
+## Install
+
+```bash
+python3 -m pip install -e .
+```
+
+## Test
+
+The test suite uses the Python standard-library `unittest` runner:
+
+```bash
+PYTHONPATH=src python3 -m unittest discover -s tests -v
+```
+
+If the package is installed in editable mode, `PYTHONPATH=src` is optional.
+
+## Basic Workflow
+
+Initialize a study:
+
+```bash
+aituner study init --spec configs/examples/study.example.json
+```
+
+Run a local tuning loop:
+
+```bash
+aituner study tune --spec configs/examples/study.example.json --max-trials 2
+```
+
+Run a comparison:
+
+```bash
+aituner compare run --spec configs/examples/compare.example.json
+```
+
+Remote experiment notes for this checkout live in `AGENTS.md`. The default
+remote host is `dash0`, and code should be synchronized through Git before
+remote runs.
+
+## Experiment Integrity
+
+- Fixed-length replay requests are scored only when completion token usage is
+  verifiable and matches the trace expectation.
+- Each trial writes aggregate probe history and per-request probe details.
+- `request_rate_per_gpu` is the primary cross-topology metric:
+  `best_feasible_request_rate / (tensor_parallel_size * data_parallel_size)`.
+- Compare reports include counts of failed windows and of windows with no
+  feasible load; do not interpret mean request rates without those counts.
+- Bounded replays using `max_requests_per_probe`, `completion_tokens_override`,
+  or `replay_time_scale` are convergence tests for that bounded workload, not
+  production benchmarks.
+
+## Configuration Notes
+
+Example specs that use `llm.endpoint.provider=codex` resolve the endpoint from
+the local Codex configuration unless `llm.endpoint.base_url` or
+`AITUNER_CODEX_BASE_URL` is set. Public, reproducible examples should prefer an
+explicit endpoint or omit the LLM endpoint and use proposal files.
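
The `request_rate_per_gpu` definition under "Experiment Integrity" can be sketched as below. This is a minimal illustration of the stated formula, not code taken from AITuner: the function name, its signature, and the sample numbers are assumptions made for the example.

```python
def request_rate_per_gpu(
    best_feasible_request_rate: float,
    tensor_parallel_size: int,
    data_parallel_size: int,
) -> float:
    """Normalize a feasible request rate by the number of GPUs in the topology.

    Hypothetical helper: mirrors the README formula
    best_feasible_request_rate / (tensor_parallel_size * data_parallel_size).
    """
    gpu_count = tensor_parallel_size * data_parallel_size
    if gpu_count <= 0:
        raise ValueError("parallel sizes must be positive integers")
    return best_feasible_request_rate / gpu_count


# Illustrative numbers only (not measured results): two topologies that use
# four GPUs each but reach different feasible request rates.
print(request_rate_per_gpu(12.0, tensor_parallel_size=2, data_parallel_size=2))  # 3.0
print(request_rate_per_gpu(8.0, tensor_parallel_size=4, data_parallel_size=1))   # 2.0
```

Dividing by the GPU count is what makes results comparable across topologies: a configuration that reaches a higher absolute rate only by using more GPUs does not automatically score higher on this metric.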