Objectives
- Analyze the Qwen trace
- Customize vLLM (Alibaba version) with new features

Key Results
- Tokenize the Qwen trace with Qwen-Agent and other tools
- Profile the Qwen trace with different KV cache block configurations

Last Week
- Used Qwen-Agent to process all workloads in the Qwen trace, producing a precise token stream that simulates the actual online environment.
- Measured performance and KV cache hit rate for different cache block configurations by replaying the real Qwen trace for one hour.

Next Week
- Check the tokenization results from the Qwen trace; they may need to be modified.
- Measure KV cache performance with CPU memory.
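
For reference, the block-level hit-rate measurement listed under Last Week can be approximated offline before running it on vLLM. The sketch below is only an illustration under stated assumptions: it treats "different cache blocks" as two knobs (`block_size` in tokens and `num_blocks` as cache capacity), uses plain LRU eviction, and counts a block as a hit only when every earlier block of the same request also hit. The function names are hypothetical and are not part of vLLM or Qwen-Agent.

```python
from collections import OrderedDict
from typing import Iterable, List, Sequence, Tuple


def block_keys(tokens: Sequence[int], block_size: int) -> List[Tuple[int, ...]]:
    """Return one key per full block, where each key is the whole prefix so far.

    Keying by the full prefix means two blocks match only if all preceding
    tokens match as well. This is simple but memory-hungry; a chained hash
    would be the usual choice outside a sketch.
    """
    n_full = len(tokens) // block_size  # a trailing partial block is ignored
    return [tuple(tokens[: (i + 1) * block_size]) for i in range(n_full)]


def simulate_hit_rate(
    requests: Iterable[Sequence[int]], block_size: int, num_blocks: int
) -> float:
    """Replay tokenized requests through an LRU block cache; return the hit rate."""
    cache: "OrderedDict[Tuple[int, ...], None]" = OrderedDict()
    hits = 0
    total = 0
    for tokens in requests:
        prefix_ok = True  # becomes False after the first miss in this request
        for key in block_keys(tokens, block_size):
            total += 1
            if prefix_ok and key in cache:
                hits += 1
                cache.move_to_end(key)  # mark as most recently used
            else:
                prefix_ok = False
                cache[key] = None
                cache.move_to_end(key)
                if len(cache) > num_blocks:
                    cache.popitem(last=False)  # evict the least recently used block
    return hits / total if total else 0.0


if __name__ == "__main__":
    # Toy token streams standing in for the tokenized trace (made-up data).
    demo = [[1, 2, 3, 4, 5, 6, 7, 8], [1, 2, 3, 4, 9, 9, 9, 9], [1, 2, 3, 4, 5, 6, 7, 8]]
    for size, capacity in [(4, 8), (4, 1), (8, 8)]:
        rate = simulate_hit_rate(demo, size, capacity)
        print(f"block_size={size} num_blocks={capacity} hit_rate={rate:.3f}")
```

Sweeping `block_size` and `num_blocks` over the tokenized trace gives a rough baseline to compare against the hit rates measured on the real system; it does not model latency, throughput, or CPU memory offloading.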