Gahow Wang 255c8e6884 Hybrid routing: LMetric for LB + explicit affinity for high-cache sessions
Replace the full unified cost model with a simpler hybrid:
- If session has >50% cache on affinity instance AND instance not overloaded
  (num_requests <= avg * overload_factor) → stick to affinity
- Otherwise → use LMetric (P × BS) for best load balance

This combines LMetric's superior load balance with explicit session
affinity for high-value sessions that have significant cache accumulation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-25 09:05:08 +08:00
Description
No description provided
48 MiB
Languages
Python 82.9%
Shell 17.1%