obsidian/phd/weekly-report/24/241110.md

Objectives

  • Analyze the Qwen trace
  • Customize vLLM (Ali version) with new features

Key Results

  • Tokenized the Qwen trace with Qwen-agent and other tools
  • Profiled the Qwen trace with different cache blocks

Last Week

  • Used Qwen-agent to process all workloads in the Qwen trace and produce a precise token stream that simulates the actual online environment.
  • Measured performance and KV cache hit rate for different cache blocks, replaying one hour of the real Qwen trace.
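
As a rough illustration of the hit-rate measurement above, the sketch below replays tokenized requests against a block-level prefix cache with LRU eviction. It is not the vLLM implementation: the names `block_hashes` and `simulate_hit_rate`, the chained-hash scheme, the LRU policy, and the toy requests are all assumptions made for illustration, and "different cache blocks" is interpreted here as varying the block size and the number of cached blocks.

```python
from collections import OrderedDict
from typing import Iterable, List


def block_hashes(tokens: List[int], block_size: int) -> List[int]:
    """Hash every full block together with its prefix (chained hashing).

    Two requests produce the same hash for block i only if all tokens
    up to the end of block i are identical. Trailing partial blocks are ignored.
    """
    hashes = []
    prev = 0
    full = len(tokens) - len(tokens) % block_size
    for start in range(0, full, block_size):
        prev = hash((prev, tuple(tokens[start:start + block_size])))
        hashes.append(prev)
    return hashes


def simulate_hit_rate(requests: Iterable[List[int]],
                      block_size: int,
                      capacity_blocks: int) -> float:
    """Return the fraction of full blocks served from an LRU prefix cache."""
    cache: "OrderedDict[int, bool]" = OrderedDict()
    hits = 0
    total = 0
    for tokens in requests:
        prefix_ok = True  # a block only counts as a hit if its whole prefix hit
        for h in block_hashes(tokens, block_size):
            total += 1
            if prefix_ok and h in cache:
                hits += 1
                cache.move_to_end(h)  # refresh recency
            else:
                prefix_ok = False
                cache[h] = True       # (re)compute and store the block
                cache.move_to_end(h)
                if len(cache) > capacity_blocks:
                    cache.popitem(last=False)  # evict least recently used
    return hits / total if total else 0.0


if __name__ == "__main__":
    # Toy token streams standing in for the tokenized trace (hypothetical data).
    toy_requests = [
        [1, 2, 3, 4, 5, 6, 7, 8],
        [1, 2, 3, 4, 9, 9, 9, 9],
    ]
    for size in (2, 4, 8):
        rate = simulate_hit_rate(toy_requests, block_size=size, capacity_blocks=10)
        print(f"block_size={size} hit_rate={rate:.2f}")
```

Feeding the real token stream through such a simulator could serve as a quick cross-check of the hit rates measured on the actual system, assuming the eviction policy is comparable.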

Next Week

  • Check the tokenization results from the Qwen trace; they may need to be modified.
  • Measure KV cache performance when using CPU memory.