Objective
- Serverless cache for KVCache

Key Results
- Implement the workload-aware policy in vLLM
- Profile the workload-aware policy [3/10]

Last Week
- Implemented a priority-based evictor for both the GPU and CPU sides; priorities are calculated by our policy.
- Tested our policy under relatively small cache memory and obtained a 30% cache hit ratio and a 10% performance improvement. This shows that our policy is useful when cache memory is limited. For larger cache memory, the policy still needs some fine-tuning.
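
For reference, a minimal sketch of what a priority-based evictor can look like. This is an illustration only, not our vLLM code: the class name `PriorityEvictor`, its methods, and the numeric priorities are hypothetical, and the priority is assumed to be a single float supplied by the policy (lower means evict sooner). The same structure could be instantiated once for the GPU side and once for the CPU side.

```python
import heapq
import itertools
from typing import Dict, List, Tuple


class PriorityEvictor:
    """Hypothetical sketch: evict the block with the lowest priority first."""

    def __init__(self) -> None:
        # Heap entries are (priority, insertion order, block_id).
        self._heap: List[Tuple[float, int, int]] = []
        # Latest valid (priority, insertion order) per block; used to skip stale heap entries.
        self._live: Dict[int, Tuple[float, int]] = {}
        self._order = itertools.count()

    def add(self, block_id: int, priority: float) -> None:
        """Register a block, or update its priority if already present."""
        order = next(self._order)
        self._live[block_id] = (priority, order)
        heapq.heappush(self._heap, (priority, order, block_id))

    def remove(self, block_id: int) -> None:
        """Stop tracking a block (for example, when it is reused by a request)."""
        self._live.pop(block_id, None)

    def evict(self) -> int:
        """Return the id of the lowest-priority block and stop tracking it."""
        while self._heap:
            priority, order, block_id = heapq.heappop(self._heap)
            # Entries that were updated or removed no longer match and are skipped.
            if self._live.get(block_id) == (priority, order):
                del self._live[block_id]
                return block_id
        raise ValueError("no evictable blocks")

    def __len__(self) -> int:
        return len(self._live)
```

Updates are handled lazily: a new priority pushes a fresh heap entry and the old one is discarded when it surfaces, which keeps `add` and `evict` at O(log n) without searching the heap.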

Next Week
- Improve our policy for larger cache memory.
- Analyze the new trace.