Objectives
- Auto LLM inference config tuner

Key Results
- [2/10] Workload grouping methods
- [9/10] Build the first version of the auto-tuner system
- [8/10] Survey the current state of parallelism config optimization
- [4/10] Understand the possibilities and challenges of automatically arranging the LLM inference compute graph
- [1/10] Define the IR for automatic optimization
- [5/10] Profile different parallelism setups with real traces and analyze their differences

Last Week
- [KR1] Run benchmarks for different workload classifications and show that the classification method shifts the best config: different workload groups need different configs to maximize goodput (a selection sketch is included at the end of this report).
- [misc] Prepare for the IPADS group meeting presentation.
- [misc] Prepare for the ChinaSys presentation.

Next Week
- Define the workload classification space and find a method to group workloads.
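
Appendix: a minimal sketch of the per-group config selection behind the KR1 claim. The group names, parallelism configs, and goodput numbers below are hypothetical placeholders, not measured results; the point is only to show how a per-group choice can beat a single global config.

```python
# Minimal sketch (hypothetical data and names): pick the goodput-maximizing
# parallelism config per workload group from profiled goodput, and compare
# against choosing one global config for all groups.
from collections import defaultdict

# profiled_goodput[(group, config)] = requests/s meeting SLO (made-up numbers)
profiled_goodput = {
    ("short-prompt", "tp=1,pp=1"): 120.0,
    ("short-prompt", "tp=2,pp=1"): 95.0,
    ("long-prompt",  "tp=1,pp=1"): 40.0,
    ("long-prompt",  "tp=2,pp=1"): 70.0,
}

def best_config_per_group(goodput):
    """Return the goodput-maximizing config for each workload group."""
    best = {}
    for (group, config), value in goodput.items():
        if group not in best or value > best[group][1]:
            best[group] = (config, value)
    return best

def best_global_config(goodput):
    """Return the single config with the highest total goodput across all groups."""
    totals = defaultdict(float)
    for (_, config), value in goodput.items():
        totals[config] += value
    return max(totals.items(), key=lambda kv: kv[1])

if __name__ == "__main__":
    print(best_config_per_group(profiled_goodput))
    # {'short-prompt': ('tp=1,pp=1', 120.0), 'long-prompt': ('tp=2,pp=1', 70.0)}
    print(best_global_config(profiled_goodput))
    # the single best global config still leaves goodput on the table
    # for at least one group (short-prompt drops from 120 to 95 here)
```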