Initial commit: obsidian to gitea
15
phd/weekly-report/25/250427.md
Normal file
@@ -0,0 +1,15 @@
Objective
- Serverless KVCache cache
Key Result
- Refine cache policy implementation
- Implement and test our workload-aware cache policy in vLLM
- Write graduation thesis
Last Week
- Refined the cache policy to account for the _cost_ of keeping a cache entry in memory; the hit rate improved by about 1% to 2% with 1k+1k cache blocks.
- Implemented the PDF-based workload-aware (WA) cache policy in vLLM and profiled LRU vs. WA on Qwen2-7B; WA reduced QTTFT by 25%.
- Finished the first draft of the graduation thesis.
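
The LRU vs. workload-aware comparison above can be illustrated with a toy simulator. This is only a minimal sketch under stated assumptions, not the policy implemented in vLLM: the class names, the `REUSE_PROB` table, the workload labels, and the trace are all hypothetical, and the score is a plain per-workload reuse estimate rather than the PDF-based, cost-aware score described in the report.

```python
from collections import OrderedDict


class LRUCache:
    """Baseline: evict the least recently used block."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()
        self.hits = 0

    def access(self, key, workload=None):
        if key in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(key)  # mark as most recently used
            return True
        if len(self.blocks) >= self.capacity:
            self.blocks.popitem(last=False)  # drop the oldest block
        self.blocks[key] = True
        return False


class WorkloadAwareCache:
    """Toy policy: evict the block whose workload is least likely to be reused.

    Ties are broken by recency, so with a single workload it behaves like LRU.
    """

    def __init__(self, capacity, reuse_prob):
        self.capacity = capacity
        self.reuse_prob = reuse_prob  # hypothetical: workload -> reuse estimate
        self.blocks = {}  # key -> (workload, last access tick)
        self.tick = 0
        self.hits = 0

    def access(self, key, workload):
        self.tick += 1
        hit = key in self.blocks
        if hit:
            self.hits += 1
        elif len(self.blocks) >= self.capacity:
            victim = min(
                self.blocks,
                key=lambda k: (self.reuse_prob[self.blocks[k][0]], self.blocks[k][1]),
            )
            del self.blocks[victim]
        self.blocks[key] = (workload, self.tick)
        return hit


# Hypothetical workload labels and reuse estimates, chosen only for illustration.
REUSE_PROB = {"chat": 0.9, "batch": 0.1}

# A reused "chat" prefix interleaved with one-shot "batch" blocks.
TRACE = [
    ("c1", "chat"), ("b1", "batch"), ("b2", "batch"),
    ("c1", "chat"), ("b3", "batch"), ("b4", "batch"),
    ("c1", "chat"),
]


def replay(trace, capacity):
    """Replay the same trace through both policies and return their hit counts."""
    lru = LRUCache(capacity)
    wa = WorkloadAwareCache(capacity, REUSE_PROB)
    for key, workload in trace:
        lru.access(key, workload)
        wa.access(key, workload)
    return lru.hits, wa.hits


if __name__ == "__main__":
    lru_hits, wa_hits = replay(TRACE, capacity=2)
    print(f"LRU hits: {lru_hits}, workload-aware hits: {wa_hits}")
```

On this trace the one-shot blocks push the reused block out of the LRU cache, while the workload-aware variant keeps it. A cost term could be folded into the score in the same place, for example by dividing the reuse estimate by a per-block memory cost.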
Next Week
- Run full tests of the different cache policies across different models.