[ServeGen: Workload Characterization and Generation of Large Language Model Serving in Production](https://arxiv.org/pdf/2505.09999)
Evict from the M queue first
![[projects/kvcachecache/Dev.figs/250414-000021.png]]
| Config | S3FIFO |
|------------|----------|
| 1kGPU1kCPU | 0.095005 |
| 1kGPU2kCPU | 0.136413 |
| 1kGPU4kCPU | 0.213832 |
Evict from the S queue first
![[projects/kvcachecache/Dev.figs/250414-000021-1.png]]
| Config | S3FIFO |
| ---------- | -------- |
| 1kGPU1kCPU | 0.095005 |
| 1kGPU2kCPU | 0.136413 |
| 1kGPU4kCPU | 0.213832 |
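
The two variants above differ only in which S3-FIFO queue is tried first when space is needed. The note does not spell out the exact rule, so the following is a minimal sketch under one possible reading: evict from the preferred queue whenever it is non-empty, and fall back to the other queue otherwise. The class name `S3FIFOSketch`, the `evict_first` parameter, the frequency cap of 3, and the ghost-queue size are all assumptions made for illustration, not the project's actual code, and the sketch is not meant to reproduce the numbers in the tables.

```python
from collections import OrderedDict, deque


class S3FIFOSketch:
    """Simplified S3-FIFO with a configurable eviction preference (hypothetical)."""

    def __init__(self, capacity, evict_first="S"):
        assert capacity >= 1 and evict_first in ("S", "M")
        self.capacity = capacity
        self.evict_first = evict_first   # assumption: which queue is tried first
        self.small = deque()             # S queue, oldest key on the left
        self.main = deque()              # M queue, oldest key on the left
        self.ghost = OrderedDict()       # keys recently evicted from S (no data)
        self.freq = {}                   # resident key -> capped access count

    def access(self, key):
        """Return True on a hit, False on a miss (the key is then admitted)."""
        if key in self.freq:
            self.freq[key] = min(self.freq[key] + 1, 3)
            return True
        while len(self.freq) >= self.capacity:
            self._evict_step()
        if key in self.ghost:
            # Seen recently: admit directly into M.
            del self.ghost[key]
            self.main.append(key)
        else:
            self.small.append(key)
        self.freq[key] = 0
        return False

    def _evict_step(self):
        # One step may promote or reinsert a key instead of freeing a slot;
        # access() keeps calling until a slot is actually free.
        prefer_small = self.evict_first == "S"
        if (prefer_small and self.small) or not self.main:
            self._evict_from_small()
        else:
            self._evict_from_main()

    def _evict_from_small(self):
        key = self.small.popleft()
        if self.freq[key] > 1:
            self.main.append(key)        # reused while in S: promote to M
        else:
            del self.freq[key]           # evict, but remember the key in ghost
            self.ghost[key] = None
            while len(self.ghost) > self.capacity:
                self.ghost.popitem(last=False)

    def _evict_from_main(self):
        key = self.main.popleft()
        if self.freq[key] > 0:
            self.freq[key] -= 1          # give the key another pass through M
            self.main.append(key)
        else:
            del self.freq[key]
```

Replaying the same request trace through `S3FIFOSketch(capacity, evict_first="M")` and `S3FIFOSketch(capacity, evict_first="S")` and counting the `True` results from `access` gives a hit count for each variant that can be compared directly.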