Objectives
- Analyze the QWen trace
- Customize vLLM (Ali version) with new features
- Port XPURemoting to PhOS

Key Results
- Improve workload separation in the QWen trace
- Measure vLLM's KVCache hit rate for different open-source workloads
- Build a unified Docker image for XPURemoting and PhOS

Last Week
- Derived a unified workload taxonomy for the QWen trace covering both the Web and App ends.
- Ran vLLM (Ali version) and started customizing it to expose new metrics (e.g., KVCache hit rate for different workloads).
- Built a new Docker image that satisfies PhOS's base image requirement and includes the XPURemoting environment (statically linked PyTorch 1.13.1).

Next Week
- Customize vLLM to support new features such as comparing KVCache scheduling policies.
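
Note on the KVCache hit rate metric: the sketch below shows one way a per-workload hit rate could be estimated offline from a trace, before any vLLM customization is in place. It is an assumption-laden illustration, not vLLM's implementation: the block size, the `hit_rate_by_workload` helper, the workload labels, and the token IDs are all hypothetical, and the cache is modeled as unbounded (no eviction), so the numbers are an upper bound on what a real scheduler would see.

```python
from collections import defaultdict

BLOCK_SIZE = 4  # tokens per KV cache block (illustrative value, not a vLLM default)


def block_keys(tokens, block_size=BLOCK_SIZE):
    """Yield one key per full block; each key covers the whole prefix up to that block."""
    for end in range(block_size, len(tokens) + 1, block_size):
        yield tuple(tokens[:end])


def hit_rate_by_workload(requests, block_size=BLOCK_SIZE):
    """requests: iterable of (workload_label, token_ids). Returns {label: hit_rate}."""
    cached = set()  # unbounded cache: eviction is NOT modeled (assumption)
    hits = defaultdict(int)
    total = defaultdict(int)
    for label, tokens in requests:
        for key in block_keys(tokens, block_size):
            total[label] += 1
            if key in cached:
                hits[label] += 1
            else:
                cached.add(key)
    return {label: hits[label] / total[label] for label in total}


# Made-up trace: two workloads, each with a second request extending the first.
requests = [
    ("chat", [1, 2, 3, 4, 5, 6, 7, 8]),
    ("chat", [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]),
    ("code", [20, 21, 22, 23]),
    ("code", [20, 21, 22, 23, 24, 25, 26, 27]),
]
rates = hit_rate_by_workload(requests)
```

Keying each block by its full prefix means a block counts as a hit only when every preceding token matches, which is the property a prefix-style KV cache relies on. Replacing the unbounded `set` with a bounded structure would be one way to compare scheduling or eviction policies on the same trace.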