Objective
- Serverless KVCache cache

Key Results
- Implement the workload-aware policy in vLLM
- Profile the workload-aware policy

[3/10]

Last Week
- Implemented a priority-based evictor (priorities are calculated by our policy) for both the GPU and CPU sides.
- Tested our policy with a relatively small cache memory and obtained a 30% cache hit ratio and a 10% performance improvement. This shows that our policy is useful when cache memory is limited. With larger cache memory, the policy still needs some fine-tuning.

Next Week
- Improve our policy for larger cache memory.
- Analyze the new trace.
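
For reference, the priority-based evictor mentioned under Last Week can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual vLLM code: the class name `PriorityEvictor`, its method names, and the use of a single float score per block are all hypothetical, and the workload-aware priority calculation itself is not shown. The sketch assumes that a lower score means the block is evicted sooner.

```python
import heapq
from typing import Dict, List, Tuple


class PriorityEvictor:
    """Hypothetical evictor: evicts the block with the lowest priority first."""

    def __init__(self) -> None:
        # block_id -> current priority; the source of truth for membership.
        self._priority: Dict[int, float] = {}
        # Min-heap of (priority, block_id); it may contain stale entries.
        self._heap: List[Tuple[float, int]] = []

    def __len__(self) -> int:
        return len(self._priority)

    def add(self, block_id: int, priority: float) -> None:
        """Register an evictable block, or update its priority."""
        self._priority[block_id] = priority
        heapq.heappush(self._heap, (priority, block_id))

    def remove(self, block_id: int) -> None:
        """Drop a block from the evictable set, e.g. when it is reused."""
        self._priority.pop(block_id, None)

    def evict(self) -> int:
        """Return the id of the lowest-priority block and forget it."""
        while self._heap:
            priority, block_id = heapq.heappop(self._heap)
            # Skip entries whose block was removed or re-prioritized.
            if self._priority.get(block_id) == priority:
                del self._priority[block_id]
                return block_id
        raise ValueError("no evictable blocks")
```

In this sketch the same class would be instantiated twice, once for the GPU block pool and once for the CPU block pool. Stale heap entries are discarded lazily at eviction time, so updating a block's priority does not require rebuilding the heap.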