Add open source project metadata
README.md (new file, +72 lines)

# AITuner

AITuner is a small study orchestrator for OpenAI-compatible serving engines. It
replays trace windows, searches for the highest feasible offered load under
configured SLOs, and records enough trial context for LLM- or harness-guided
configuration proposals.

## Status

This repository is research tooling. Treat reported experiment numbers as valid
only when the matching study spec, trial artifacts, probe history, and
`probe_details.jsonl` files are available for audit.

## Install

```bash
python3 -m pip install -e .
```
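
A quick smoke check after installing; this assumes the `aituner` entry point
exposes a standard argparse-style help flag, which this README does not itself
confirm:

```bash
# Should list the top-level subcommands (study, compare) if the install worked.
aituner --help
```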

## Test

The test suite uses the Python standard library `unittest` runner:

```bash
PYTHONPATH=src python3 -m unittest discover -s tests -v
```

If the package is installed in editable mode, `PYTHONPATH=src` is optional.
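
To iterate on a single test module instead of the whole suite, the standard
`unittest` runner also accepts a dotted module path; `tests.test_study` below
is a hypothetical name, not a module confirmed by this repository:

```bash
# Hypothetical module name; substitute a real file from tests/.
PYTHONPATH=src python3 -m unittest tests.test_study -v
```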

## Basic Workflow

Initialize a study:

```bash
aituner study init --spec configs/examples/study.example.json
```

Run a local tuning loop:

```bash
aituner study tune --spec configs/examples/study.example.json --max-trials 2
```

Run a compare:

```bash
aituner compare run --spec configs/examples/compare.example.json
```

Remote experiment notes for this checkout live in `AGENTS.md`. The default
remote host is `dash0`, and code should be synchronized through Git before
remote runs.
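
A minimal sketch of that Git-based sync, assuming `dash0` is reachable over SSH
and holds a clone of this repository at `~/aituner` (both are assumptions;
`AGENTS.md` is the authority here):

```bash
# Push local work, then fast-forward the remote checkout before launching runs.
git push origin HEAD
ssh dash0 'cd ~/aituner && git pull --ff-only'
```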

## Experiment Integrity

- Fixed-length replay requests are scored only when completion token usage is
  verifiable and matches the trace expectation.
- Each trial writes aggregate probe history and per-request probe details.
- `request_rate_per_gpu` is the primary cross-topology metric:
  `best_feasible_request_rate / (tensor_parallel_size * data_parallel_size)`
  (see the worked example after this list).
- Compare reports include failed and no-feasible window counts; do not interpret
  mean request rates without those counts.
- Bounded replays using `max_requests_per_probe`, `completion_tokens_override`,
  or `replay_time_scale` are convergence tests for that bounded workload, not
  production benchmarks.
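
For concreteness, a worked instance of that formula with made-up numbers: a
window whose best feasible rate is 12 req/s on a `tensor_parallel_size=4`,
`data_parallel_size=2` deployment spans 8 GPUs:

```bash
# Hypothetical values: 12 req/s best feasible rate across 4 * 2 = 8 GPUs.
python3 -c 'print(12 / (4 * 2))'  # request_rate_per_gpu = 1.5
```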

## Configuration Notes

Example specs that use `llm.endpoint.provider=codex` resolve the endpoint from
the local Codex configuration unless `llm.endpoint.base_url` or
`AITUNER_CODEX_BASE_URL` is set. Public, reproducible examples should prefer an
explicit endpoint or omit the LLM endpoint and use proposal files.
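
For example, a reproducible invocation can pin the endpoint through the
environment variable named above; the URL here is a placeholder for whatever
OpenAI-compatible server you actually run:

```bash
# Placeholder URL; point at your own OpenAI-compatible serving endpoint.
export AITUNER_CODEX_BASE_URL=http://127.0.0.1:8000/v1
aituner study tune --spec configs/examples/study.example.json --max-trials 2
```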