aituner/docs/superpowers/plans/2026-05-06-repo-audit-repair.md
Gahow Wang d7df1ebdac
Add open source project metadata
2026-05-06 21:18:21 +08:00


# Repo Audit Repair Implementation Plan

**For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Repair the audit findings that affect measurement integrity, state correctness, documentation accuracy, and open-source readiness.

**Architecture:** Keep changes localized to the existing stdlib-only Python package. Measurement validation lives at the HTTP/worker boundary, study-state fixes stay in `StudyStore`, compare reporting gains explicit failed/no-feasible accounting, and project metadata/docs are added at the repo root.

**Tech Stack:** Python 3.11+ stdlib, unittest, setuptools via `pyproject.toml`.


## Task 1: Measurement Integrity

**Files:**

- Modify: `src/aituner/http_client.py`
- Modify: `src/aituner/slo.py`
- Modify: `src/aituner/worker.py`
- Test: `tests/test_core_flow.py`

**Steps:**

- [ ] Write failing tests for completion-token source/mismatch failures and for persisted per-request probe details.
- [ ] Run the targeted tests and confirm they fail for the expected reason.
- [ ] Add token-source metadata to streamed metrics and request outcomes.
- [ ] Fail requests when the configured completion length cannot be verified from usage or differs from the expected value.
- [ ] Persist probe-outcome details under each trial's artifact directory.
- [ ] Run the targeted tests and the full unittest suite.
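The verification rule in the steps above could be sketched as follows. This is a hypothetical illustration, not the actual aituner API: the names `verify_completion_tokens`, `TokenCheck`, and the `source` labels are all assumptions.

```python
# Hypothetical sketch of the Task 1 rule: a request only counts toward
# measurement when the server-reported usage confirms the configured
# completion length. All names here are illustrative.
from dataclasses import dataclass
from typing import Optional


@dataclass
class TokenCheck:
    ok: bool      # whether the request may count toward measurement
    source: str   # where the completion-token count came from
    detail: str   # human-readable reason, persisted with the probe output


def verify_completion_tokens(expected: int, usage_tokens: Optional[int]) -> TokenCheck:
    """Fail unless usage confirms the configured completion length."""
    if usage_tokens is None:
        # No usage block in the response: the length cannot be verified.
        return TokenCheck(False, "missing", "no completion token count in usage")
    if usage_tokens != expected:
        return TokenCheck(
            False, "usage",
            f"expected {expected} completion tokens, got {usage_tokens}",
        )
    return TokenCheck(True, "usage", "completion length verified from usage")
```

Persisting the `TokenCheck` fields alongside each trial's artifacts would cover the probe-detail step in the same pass.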

## Task 2: State, Spec, and Compare Guards

**Files:**

- Modify: `src/aituner/spec.py`
- Modify: `src/aituner/store.py`
- Modify: `src/aituner/compare.py`
- Modify: `scripts/run_multi_compare.py`
- Test: `tests/test_core_flow.py`

**Steps:**

- [ ] Write failing tests for state-list isolation, invalid trace numeric bounds, and compare aggregate failure accounting.
- [ ] Run the targeted tests and confirm the expected failures.
- [ ] Deep-copy or replace trial lists when materializing trials, so callers cannot mutate stored state.
- [ ] Validate that trace controls are positive in `TraceSpec.from_dict`.
- [ ] Report failed/no-feasible counts in compare aggregates without changing the existing winner semantics.
- [ ] Run the targeted tests and the full unittest suite.
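A minimal sketch of the `TraceSpec.from_dict` bounds check described above. The field names `concurrency` and `duration_s` are assumed for illustration and may not match the real schema:

```python
# Hypothetical guard rejecting non-positive or non-numeric trace controls
# before a study is built. Field names are illustrative, not the real
# aituner TraceSpec schema.
def validate_trace_controls(data: dict) -> dict:
    for key in ("concurrency", "duration_s"):
        value = data.get(key)
        # bool is a subclass of int, so exclude it explicitly
        if isinstance(value, bool) or not isinstance(value, (int, float)) or value <= 0:
            raise ValueError(
                f"trace control {key!r} must be a positive number, got {value!r}"
            )
    return data
```

The state-isolation step is the standard pattern of returning `copy.deepcopy(...)` of stored trial lists rather than the live objects.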

## Task 3: Docs and Open-Source Readiness

**Files:**

- Create: `README.md`
- Create: `LICENSE`
- Create: `CONTRIBUTING.md`
- Create: `SECURITY.md`
- Modify: `pyproject.toml`
- Modify: selected docs under `docs/`

**Steps:**

- [ ] Add concise repo usage, verification, and experiment-integrity guidance.
- [ ] Add the MIT license and contribution/security notes.
- [ ] Add project metadata and an optional `test` extra to `pyproject.toml`.
- [ ] Update stale docs about high-stop behavior and the current test count.
- [ ] Run JSON validation and the full unittest suite.
- [ ] Commit changes in logical groups.
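The metadata step above could look roughly like the fragment below; the version and the (empty) extra contents are placeholders, not the project's actual values:

```toml
# Sketch of the pyproject.toml additions; values are placeholders.
[project]
name = "aituner"
version = "0.1.0"              # placeholder
requires-python = ">=3.11"
license = { text = "MIT" }

[project.optional-dependencies]
# The suite uses stdlib unittest, so the extra is empty today; it exists
# so CI and tooling can `pip install .[test]` uniformly if test-only
# dependencies appear later.
test = []

[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"
```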