Initial commit: obsidian to gitea
phd/weekly-report/24/241229.md
@@ -0,0 +1,14 @@
Objective
- Serverless KVCache cache
Key Results
- Implement the workload-aware policy in vLLM
- Profile the workload-aware policy [3/10]
Last Week
- Implemented a priority-based evictor (priorities are calculated by our policy) for both the GPU and CPU sides.
- Tested our policy with a relatively small cache memory and obtained a 30% cache hit ratio and a 10% performance improvement. This shows that our policy is useful when cache memory is limited. For larger cache memory, the policy still needs some fine-tuning.
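
The priority-based eviction idea above can be sketched as follows. This is a minimal illustration and not the vLLM implementation: `PriorityEvictor`, `priority_fn`, and `hit_ratio` are hypothetical names, and `priority_fn` is only a stand-in for the workload-aware policy, whose actual scoring is not described here. The sketch assumes a single fixed-capacity tier; the GPU and CPU sides would each hold their own instance.

```python
import heapq
import itertools
from typing import Callable, Dict, Hashable, List, Optional, Tuple


class PriorityEvictor:
    """Toy fixed-capacity block cache that evicts the lowest-priority block first."""

    def __init__(self, capacity: int, priority_fn: Callable[[Hashable], float]):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        # Hypothetical stand-in for the workload-aware policy.
        self.priority_fn = priority_fn
        # block_id -> (priority, sequence number of its live heap entry)
        self._entries: Dict[Hashable, Tuple[float, int]] = {}
        # Min-heap of (priority, sequence number, block_id); may hold stale entries.
        self._heap: List[Tuple[float, int, Hashable]] = []
        self._seq = itertools.count()
        self.hits = 0
        self.lookups = 0

    def __contains__(self, block_id: Hashable) -> bool:
        return block_id in self._entries

    def access(self, block_id: Hashable) -> bool:
        """Look up a block, inserting it on a miss. Returns True on a hit."""
        self.lookups += 1
        hit = block_id in self._entries
        if hit:
            self.hits += 1
        elif len(self._entries) >= self.capacity:
            self._evict()
        # (Re)score the block; any older heap entry for it becomes stale.
        priority = self.priority_fn(block_id)
        seq = next(self._seq)
        self._entries[block_id] = (priority, seq)
        heapq.heappush(self._heap, (priority, seq, block_id))
        return hit

    def _evict(self) -> Optional[Hashable]:
        """Remove and return the cached block with the lowest priority."""
        while self._heap:
            _, seq, block_id = heapq.heappop(self._heap)
            entry = self._entries.get(block_id)
            if entry is not None and entry[1] == seq:  # skip stale entries
                del self._entries[block_id]
                return block_id
        return None

    def hit_ratio(self) -> float:
        return self.hits / self.lookups if self.lookups else 0.0
```

Replaying a request trace through `access` and reading `hit_ratio()` at different `capacity` values is one simple way to compare the small-cache and large-cache regimes mentioned above.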
Next Week
- Improve our policy for larger cache memory.
- Analyze the new trace.