763355b8257244a6d7c5a03fc5e2ec2892931a87
Smoke validation on dash0 surfaced three real bugs that broke interference and failure-attribution labels end-to-end: 1. endpoint_url in metrics is the proxy URL (e.g. http://h:9200); the vLLM worker URL lives in breakdown's routed_to. The interference index and label path were taking endpoint_url first, so every request looked routed to a non-existent worker and the overlap counter stayed at zero. 2. _normalize_worker hard-coded base port 8000, so a smoke run on port 9100 resolved to engine_1100 instead of engine_0. Added a --worker-map URL=engine_id CLI flag and _resolve_worker() that prefers the explicit map and falls back to the heuristic. 3. vLLM rewrites the per-step rid as cmpl-<proxy_id>-<i>-<hash>, so the str equality check between per_req rid and our proxy request_id never matched -> every prefill step looked like "other request prefill", which would have flipped overlap to 100%. Added _vllm_rid_matches() that strips the cmpl-/chatcmpl- prefix. After the fix, the same smoke run reports interference_index = 22.9 across 24 overlap / 6 clean requests on a single instance, which is the expected shape for serial dispatch into a cold engine. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Description
No description provided
Languages
Python
82.9%
Shell
17.1%