Files
agentic-pd-hybrid/tests
Gahow Wang a785b83023 test(policy): unit tests for Algorithm 1 lex scoring
Adds the project's first test suite. Covers the
score_candidate() pure function from the previous refactor
commit, validating the qualitative properties that
KVC_ROUTER_ALGORITHM.md §3.1 and §4.2 rely on.

Tests / properties:
  - determinism: same args -> same tuple
  - shape: 4-int tuple
  - primary term: overlap dominates pure sticky
  - primary term: sticky_bonus credited
  - tie-2 inflight: lower wins
  - tie-3 assigned: lower wins
  - strict lex order: sticky wins position-1 over fresh-idle
  - load_floor disabled by default
  - load_floor gated off when sticky=True
  - load_floor zero during warmup (mean=0)
  - load_floor proportional to deficit (200/100/0 at 0/50/100% load)
  - load_floor does not underflow when overloaded
  - real per-session overlap beats load_floor on warm D
  - boilerplate overlap loses to load_floor on cold D
    (the cold-D fix from E1_E2_FIX_DESIGN §Q2)

Test infrastructure:
  - tests/ package with README explaining the GPU-free
    scope and the run instruction
  - pyproject.toml [dependency-groups] test = [pytest>=8]
    (install via `uv sync --group test`)
  - pyproject.toml [tool.pytest.ini_options] sets testpaths

Verified locally: 14/14 passing under pytest 9.0.3 in an
isolated 3.13 venv. No SGLang / GPU touched.
2026-05-12 23:54:48 +08:00
..

Tests

Pure-Python unit + property tests for the algorithm layer. These tests do not import SGLang and do not need a GPU — they validate the routing algorithm (Algorithm 1/2/3 in docs/KVC_ROUTER_ALGORITHM.md) and its theorems against the pure functions extracted from policies.py.

Run

uv sync --group test
uv run pytest

Or, without uv:

pip install pytest
PYTHONPATH=src pytest tests

Scope

  • test_policy_scoring.py — Algorithm 1 lex-score properties (overlap dominates sticky, load-floor gating, tie-breakers).
  • test_no_starvation.py — Theorem 1: bounded retries before some D either accepts or the least-rejected D is forced through the degenerate path.

Future:

  • block-level eviction MockRadixCache tests (see docs/BLOCK_LEVEL_EVICTION_DESIGN_ZH.md §5).
  • D→P sync staleness_budget property tests (see docs/D_TO_P_SYNC_CONTRACT_ZH.md §1).

Why no integration tests here

Anything that needs SGLang, mooncake, or a real model is an integration test and must run on hardware. Those tests live as scripts/sweep_*.sh under the evaluation protocol in docs/EVALUATION_PROTOCOL_ZH.md.