Objectives
- Auto LLM inference config tuner

Key Results
- [2/10] Workload grouping methods
- [9/10] Build the first version of the auto-tuner system
- [8/10] Survey the current state of parallelism config optimization
- [4/10] Understand the opportunities and challenges in automatically arranging LLM inference compute graphs
- [1/10] Define the IR for automatic optimization
- [5/10] Profile different parallelism setups with real traces and analyze their differences

Last Week
- [KR2] Ran the AI Tuner under the current Ali workload groups (grouped by input length / label) to look for insights toward building a better AI Tuner.
- [misc] Built the system for the EuroSys Shadow experiment.

Next Week
- Compare the AI Tuner results with Ali's current setup to find more insights for the AI Tuner.
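
For reference, one workload-grouping method of the kind KR2 refers to is bucketing requests by input length so each group can later be tuned with its own parallelism config. The sketch below is a hypothetical illustration; the function name, request schema, and bucket boundaries are assumptions, not the actual AI Tuner implementation.

```python
from collections import defaultdict

def group_by_input_length(requests, boundaries=(128, 512, 2048)):
    """Assign each request to the first bucket whose upper bound covers
    its input length; requests longer than every bound share a tail bucket."""
    groups = defaultdict(list)
    for req in requests:
        n = req["input_len"]
        for bound in boundaries:
            if n <= bound:
                groups[f"<= {bound}"].append(req)
                break
        else:
            # No boundary matched: request goes to the open-ended tail bucket.
            groups[f"> {boundaries[-1]}"].append(req)
    return dict(groups)

requests = [
    {"id": 0, "input_len": 64},
    {"id": 1, "input_len": 300},
    {"id": 2, "input_len": 4096},
]
print(sorted(group_by_input_length(requests)))  # ['<= 128', '<= 512', '> 2048']
```

Grouping by a label field would work the same way, keyed on `req["label"]` instead of a length bucket.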