e532e83d3e38d26dee88cb0686a4625b467f13ad
Write per-port vllm:prefix_cache_{queries,hits}_total to instance_apc.txt. For PD
this is the only honest reuse signal: producer ports show cross-turn prefix
hits, while the consumer's per-request cached_tokens only counts transferred KV.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
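
For illustration only, the per-port collection described in the message could look roughly like the sketch below. It is not the code from this commit: the function names, the default host, and the row layout are hypothetical, and the real format of instance_apc.txt is not shown here. The only thing taken from the message is the pair of counter names, which are parsed from the Prometheus text served at each instance's /metrics endpoint.

```python
import re
import urllib.request

# Matches sample lines such as:
#   vllm:prefix_cache_hits_total{model_name="m"} 42.0
# Comment lines (# HELP / # TYPE) start with '#' and are skipped.
_LINE = re.compile(
    r'^vllm:prefix_cache_(queries|hits)_total(?:\{[^}]*\})?\s+(\S+)'
)


def parse_apc(metrics_text):
    """Sum the prefix-cache query and hit counters found in metrics text."""
    totals = {"queries": 0.0, "hits": 0.0}
    for line in metrics_text.splitlines():
        m = _LINE.match(line)
        if m:
            totals[m.group(1)] += float(m.group(2))
    return totals


def hit_rate(totals):
    """Return hits / queries, or 0.0 when no queries were recorded."""
    return totals["hits"] / totals["queries"] if totals["queries"] else 0.0


def scrape_port(port, host="localhost", timeout=5.0):
    """Fetch /metrics from one instance port (host is an assumed default)."""
    url = f"http://{host}:{port}/metrics"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_apc(resp.read().decode("utf-8"))


def format_row(port, totals):
    """Render one line per port; this layout is hypothetical."""
    return (
        f"{port} queries={totals['queries']:.0f} "
        f"hits={totals['hits']:.0f} hit_rate={hit_rate(totals):.3f}"
    )
```

A caller would loop over the producer and consumer ports, call scrape_port on each, and write the format_row lines to a file. Keeping the rows per port rather than aggregated is what lets the producer-side hits be read separately from the consumer side.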