obsidian/250427.md at a57afa86b47c58aeca557e7cbcb0d38b81159d78 - obsidian - Local Gitea

Gahow Wang a57afa86b4 Initial commit: obsidian to gitea

2026-05-07 15:04:41 +08:00

Objective

Serverless KVCache cache

Key Results

Refine cache policy implementation
Implement and test our workload-aware cache policy in vLLM
Write graduation thesis

Last Week

Refined the cache policy to account for the cost of keeping cache entries in memory, which gave roughly a 1% to 2% hit-rate improvement with 1k+1k cache blocks.
Implemented the PDF-based workload-aware (WA) cache policy in vLLM and profiled LRU vs. WA on Qwen2-7B, measuring a 25% reduction in QTTFT.
Finished the first draft of the graduation thesis.
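
To make the LRU vs. WA comparison concrete, here is a minimal toy sketch. It is not the vLLM implementation and does not reproduce the numbers above. It assumes, purely for illustration, that each block carries a workload tag and that the policy scores a block as `reuse_prob * exp(-age / mean_gap)`, evicting the lowest score. The class names, the scoring formula, the parameter values, and the trace are all hypothetical.

```python
import math
from collections import OrderedDict


class LRUCache:
    """Plain LRU over block IDs; the workload tag is ignored."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # block_id -> workload tag

    def access(self, block_id, workload):
        hit = block_id in self.blocks
        if hit:
            self.blocks.move_to_end(block_id)  # mark as most recently used
        else:
            if len(self.blocks) >= self.capacity:
                self.blocks.popitem(last=False)  # evict least recently used
            self.blocks[block_id] = workload
        return hit


class WorkloadAwareCache:
    """Toy workload-aware policy (hypothetical scoring, see lead-in)."""

    def __init__(self, capacity, reuse_prob, mean_gap):
        self.capacity = capacity
        self.reuse_prob = reuse_prob  # workload -> assumed chance of reuse
        self.mean_gap = mean_gap      # workload -> assumed mean reuse gap
        self.blocks = {}              # block_id -> (workload, last access time)
        self.clock = 0

    def _score(self, block_id):
        workload, last_access = self.blocks[block_id]
        age = self.clock - last_access
        # Higher score means the block is more worth keeping.
        return self.reuse_prob[workload] * math.exp(-age / self.mean_gap[workload])

    def access(self, block_id, workload):
        self.clock += 1
        hit = block_id in self.blocks
        if not hit and len(self.blocks) >= self.capacity:
            victim = min(self.blocks, key=self._score)  # lowest expected reuse
            del self.blocks[victim]
        self.blocks[block_id] = (workload, self.clock)
        return hit


def count_hits(cache, trace):
    """Replay (block_id, workload) pairs and count cache hits."""
    return sum(cache.access(block_id, workload) for block_id, workload in trace)


# Hypothetical trace: one reused "chat" block interleaved with one-shot "batch" blocks.
TRACE = [("A", "chat"), ("S1", "batch"), ("S2", "batch"), ("S3", "batch"),
         ("A", "chat"), ("S4", "batch"), ("S5", "batch"), ("A", "chat")]
REUSE_PROB = {"chat": 0.9, "batch": 0.05}
MEAN_GAP = {"chat": 10.0, "batch": 10.0}

lru_hits = count_hits(LRUCache(2), TRACE)
wa_hits = count_hits(WorkloadAwareCache(2, REUSE_PROB, MEAN_GAP), TRACE)
print(f"LRU hits: {lru_hits}, WA hits: {wa_hits}")
```

On this trace with two cache slots, LRU evicts the reused block every time the one-shot blocks stream through, while the score-based policy keeps it. Real workloads and the actual policy will behave differently.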

Next Week

Run full tests of the different cache policies across different models.