Gahow Wang e13391eeab Evict migrated blocks from prefix cache after KV send completes
Previously, after a session migrated from C to D via offload, C's blocks
were freed to the LRU tail (most-recently-used position), making them
the last to be evicted. Since the session won't return to C, these
blocks are dead weight occupying cache capacity.

Now capture block IDs before _free_blocks and call evict_blocks to
remove them from the prefix cache hash table, so they can be reused
sooner for active sessions.
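
The change described above can be pictured with a small toy model. This is
only a sketch under stated assumptions, not the code touched by this commit:
ToyBlockPool, on_kv_send_complete, and the session_blocks mapping are
hypothetical names, and only the ideas of freeing to the LRU tail and of
removing hash-table entries come from the message. The message does not say
whether evict_blocks also changes a block's position in the free queue, so
the sketch leaves the queue order alone.

```python
from collections import OrderedDict


class ToyBlockPool:
    """Toy prefix-cached block pool (hypothetical, for illustration only)."""

    def __init__(self, num_blocks):
        # Free queue in eviction order: the first key is reused first,
        # the last key (the LRU tail) is reused last.
        self.free_queue = OrderedDict((i, None) for i in range(num_blocks))
        # Prefix cache hash table: content hash -> block ID, plus the reverse map.
        self.hash_to_block = {}
        self.block_to_hash = {}

    def allocate(self, block_hash):
        # Prefix hit: reuse the cached block and take it off the free queue.
        cached = self.hash_to_block.get(block_hash)
        if cached is not None:
            self.free_queue.pop(cached, None)
            return cached
        # Miss: take the block at the head of the queue and drop its stale hash.
        block_id, _ = self.free_queue.popitem(last=False)
        self._drop_hash(block_id)
        self.hash_to_block[block_hash] = block_id
        self.block_to_hash[block_id] = block_hash
        return block_id

    def free_blocks(self, block_ids):
        # Freed blocks keep their hash and are appended at the LRU tail.
        for block_id in block_ids:
            self.free_queue[block_id] = None

    def evict_blocks(self, block_ids):
        # Remove the blocks from the prefix cache hash table only.
        for block_id in block_ids:
            self._drop_hash(block_id)

    def _drop_hash(self, block_id):
        block_hash = self.block_to_hash.pop(block_id, None)
        if block_hash is not None:
            self.hash_to_block.pop(block_hash, None)


def on_kv_send_complete(pool, session_blocks, session_id):
    # Capture the IDs first: after the free step the session no longer
    # holds a reference to its blocks.
    block_ids = list(session_blocks[session_id])
    pool.free_blocks(session_blocks.pop(session_id))
    pool.evict_blocks(block_ids)
    return block_ids


pool = ToyBlockPool(4)
sessions = {"s": [pool.allocate("h0"), pool.allocate("h1")]}
released = on_kv_send_complete(pool, sessions, "s")
```

In this toy model, the migrated session's hashes are gone from
hash_to_block once on_kv_send_complete returns, so a later lookup for the
same prefix misses instead of pinning blocks that will never be hit again.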

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-24 16:56:34 +08:00
48 MiB
Languages
Python 82.9%
Shell 17.1%