Gahow Wang 08530b3915 B3 policies: pseudocode reference for the five-policy sweep
Documents each pick_instance_* function from cache_aware_proxy.py in
pseudocode so the policy semantics can be cited without re-reading
implementation details. Covers lmetric (main baseline), load_only
(control with no cache awareness and no affinity), sticky (control
with hard affinity), unified (gated affinity with LMetric fallback),
and capped (lmetric run on a per-session turn-capped trace).
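
As a rough illustration of the two control policies, here is a minimal
Python sketch. It is not the code in cache_aware_proxy.py: the Instance
record, its in_flight field, the StickyRouter class, and the function
name pick_load_only are all hypothetical, and seeding a new session
with the least-loaded instance is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """Hypothetical backend record; the real proxy's fields may differ."""
    name: str
    in_flight: int = 0  # requests currently being served (assumed load signal)


def pick_load_only(instances):
    """Pick the least-loaded instance, ignoring cache state and session identity."""
    return min(instances, key=lambda inst: inst.in_flight)


@dataclass
class StickyRouter:
    """Hard affinity: a session stays on its first instance regardless of load."""
    assignments: dict = field(default_factory=dict)

    def pick(self, session_id, instances):
        if session_id not in self.assignments:
            # Assumption: the first turn of a session is placed by load.
            self.assignments[session_id] = pick_load_only(instances).name
        by_name = {inst.name: inst for inst in instances}
        return by_name[self.assignments[session_id]]
```

Under these assumptions, a session routed by StickyRouter keeps its
instance even after that instance becomes the busiest one, whereas
pick_load_only moves to whichever instance is currently least loaded.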

Includes a decision matrix that maps each policy to whether it uses
session affinity, cache awareness, load awareness, and an overload
break, plus a one-liner per control explaining which comparison
isolates which factor.
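
One way such a matrix could be represented is sketched below in Python.
Only the cells this message states or directly implies are filled in;
every other cell is None rather than guessed. The names PolicyTraits,
MATRIX, and differing_factors are hypothetical, and treating load_only
as load-aware is an inference from its name.

```python
from typing import NamedTuple, Optional


class PolicyTraits(NamedTuple):
    session_affinity: Optional[bool]
    cache_aware: Optional[bool]
    load_aware: Optional[bool]
    overload_break: Optional[bool]


# None means "not stated in this commit message", not "False".
LMETRIC = PolicyTraits(None, None, None, None)
MATRIX = {
    "lmetric": LMETRIC,
    "load_only": PolicyTraits(False, False, True, None),
    "sticky": PolicyTraits(True, None, None, None),   # hard affinity
    "unified": PolicyTraits(True, None, None, None),  # gated affinity
    "capped": LMETRIC,  # same policy as lmetric, different trace
}


def differing_factors(a, b):
    """List the factors on which two policies are both known and differ."""
    return [
        name
        for name, x, y in zip(PolicyTraits._fields, MATRIX[a], MATRIX[b])
        if x is not None and y is not None and x != y
    ]
```

With the cells filled in from the actual reference, a comparison that
differs in exactly one factor is the one that isolates that factor.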

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 19:57:02 +08:00
48 MiB
Languages
Python 82.9%
Shell 17.1%