Initial commit: obsidian to gitea
14
phd/weekly-report/24/241117.md
Normal file
@@ -0,0 +1,14 @@
Objective
- Customize vLLM (Ali version) with new features

Key Results
- Test the modified vLLM that supports CPU KV cache
- Profile and break down the modified vLLM on synthetic data and a real Qwen trace

Last Week
- Merged the vLLM version that supports CPU KV cache, then used synthetic data and a real Qwen trace to measure performance and find bugs.
- Added breakdown measurement support on the vLLM server side to measure the time spent copying KV blocks.

Next Week
- Run more tests on the vLLM version that supports CPU KV cache.
- Try to optimize the current implementation.