08530b3915b509301281ec2054eadd216926fed0
Documents each pick_instance_* function from cache_aware_proxy.py in pseudocode so the policy semantics can be cited without re-reading implementation details.

Covers lmetric (main baseline), load_only (no-cache / no-affinity control), sticky (hard-affinity control), unified (gated affinity + LMetric fallback), and capped (lmetric on a per-session turn-capped trace).

Includes a decision matrix that maps each policy to whether it uses session affinity, cache awareness, load awareness, and overload break, plus a one-liner per control explaining which comparison isolates which factor.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
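As a rough illustration of how three of the policies named above differ, here is a minimal Python sketch. It is not the code in cache_aware_proxy.py: the function names, the `loads` and `owner` dictionaries, the `overload_threshold` parameter, and the tie-breaking rule are all assumptions made for this example. The commit does not describe the LMetric scoring, so the unified policy takes its fallback as a plain callable.

```python
from typing import Callable, Dict

# Hypothetical representation: instance id -> number of in-flight requests.
Loads = Dict[str, int]


def pick_load_only(loads: Loads) -> str:
    """No cache or affinity signal: choose the least-loaded instance.

    Ties are broken by instance id so the result is deterministic.
    """
    return min(loads, key=lambda inst: (loads[inst], inst))


def pick_sticky(session_id: str, loads: Loads, owner: Dict[str, str]) -> str:
    """Hard affinity: a session always returns to the instance it first got."""
    if session_id not in owner:
        owner[session_id] = pick_load_only(loads)
    return owner[session_id]


def pick_unified(
    session_id: str,
    loads: Loads,
    owner: Dict[str, str],
    fallback: Callable[[Loads], str],
    overload_threshold: int = 8,  # assumed gate; the real condition may differ
) -> str:
    """Gated affinity: keep the session's instance unless it is overloaded,
    otherwise defer to the fallback policy and re-home the session."""
    inst = owner.get(session_id)
    if inst is not None and loads[inst] < overload_threshold:
        return inst
    inst = fallback(loads)
    owner[session_id] = inst
    return inst


if __name__ == "__main__":
    loads = {"a": 3, "b": 1}
    owner: Dict[str, str] = {}
    first = pick_sticky("s1", loads, owner)       # least loaded at the time
    loads["b"] = 10                               # that instance becomes busy
    print(first, pick_sticky("s1", loads, owner)) # sticky does not move
    print(pick_unified("s1", loads, dict(owner), pick_load_only))
```

In this sketch the load-only picker stands in for the LMetric fallback purely to keep the example self-contained. Comparing `pick_sticky` with `pick_unified` on the same inputs shows the effect of the overload gate in isolation, which is the kind of comparison the controls are meant to support.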
Languages: Python 82.9%, Shell 17.1%