obsidian/phd/weekly-report/24/241103.md

Objectives

  • Analyze the Qwen trace
  • Customize vLLM (Ali version) with new features

Key Results

  • Tokenize the Qwen trace with Qwen-agent and other tools [60%]
  • Modify vLLM to support a configurable number of KV cache blocks
  • Profile open-source datasets with different numbers of KV cache blocks

Last Week

  • Used Qwen-agent to process workloads that include files, which gives more precise token lengths for these workloads.
  • Modified vLLM's cache manager to run with a specified number of KV cache blocks, then measured how the KV cache hit rate changes with the block count across different workloads.
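
The hit-rate-versus-block-count measurement above can be approximated offline with a small simulation. The sketch below is not vLLM's cache manager: it assumes a block-level prefix cache with LRU eviction, caches only full blocks, and keys each block by a hash of its whole prefix. The names `block_keys`, `hit_rate`, and `BLOCK_SIZE` are hypothetical and chosen only for illustration.

```python
from collections import OrderedDict

BLOCK_SIZE = 4  # tokens per KV cache block (illustrative value, an assumption)


def block_keys(tokens, block_size=BLOCK_SIZE):
    """Return one key per full block; each key depends on the entire prefix."""
    keys, parent = [], None
    full_len = len(tokens) - len(tokens) % block_size  # drop the partial tail block
    for i in range(0, full_len, block_size):
        parent = hash((parent, tuple(tokens[i:i + block_size])))
        keys.append(parent)
    return keys


def hit_rate(requests, num_blocks, block_size=BLOCK_SIZE):
    """Simulate an LRU prefix cache holding at most `num_blocks` blocks.

    A block counts as a hit only if it and every earlier block of the same
    request are already cached, since a KV block is reusable only with its prefix.
    Returns hit blocks divided by total full blocks over all requests.
    """
    cache = OrderedDict()  # key -> True, ordered from least to most recently used
    hits = total = 0
    for tokens in requests:
        keys = block_keys(tokens, block_size)
        total += len(keys)
        prefix_hit = True
        for key in keys:
            if prefix_hit and key in cache:
                hits += 1
            else:
                prefix_hit = False
                cache[key] = True
            cache.move_to_end(key)  # mark as most recently used
            if len(cache) > num_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return hits / total if total else 0.0


if __name__ == "__main__":
    # Toy token-ID requests sharing a common prefix (made-up data, not a real trace).
    trace = [
        [1, 2, 3, 4, 5, 6, 7, 8],
        [1, 2, 3, 4, 5, 6, 7, 8],
        [1, 2, 3, 4, 9, 9, 9, 9],
    ]
    for n in (1, 2, 4):
        print(f"num_blocks={n}: hit rate = {hit_rate(trace, n):.2f}")
```

Sweeping `num_blocks` over a tokenized trace gives a hit-rate curve per workload, which is the shape of result the vLLM experiment measures. The real system will differ in eviction policy details, block size, and concurrency, so the numbers are only a rough guide.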

Next Week

  • Tokenize the entire Qwen trace, especially the multimodal (image) workloads, and run measurements on the tokenized trace.
  • Profile the KV cache hit rate on the actual trace and compare it with open-source traces to identify the differences.