So far, nothing fundamentally different from brute-force enumeration has been found (i.e., no more effective findings).
How can we reliably leverage general-purpose AI models to optimize real systems under noisy measurements, hard safety constraints, and large discrete configuration spaces—while preventing hallucinated actions and ensuring reproducibility?
Some problems with AI Tuner:
Lacks background knowledge: it mistakes high GPU memory utilization (~93% HBM) for an error, when in fact vLLM itself consistently fills up nearly all GPU memory.
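
One way to supply the missing background knowledge is to compare the observed HBM usage against the fraction the serving engine is configured to reserve, rather than against a generic "high usage" threshold. The sketch below is only an illustration under stated assumptions: `classify_memory_reading`, its labels, and the `tolerance` value are hypothetical and not part of AI Tuner or vLLM. `reserved_fraction` is meant to mirror vLLM's `gpu_memory_utilization` setting; the 0.9 default used here is an assumption and should be read from the actual deployment.

```python
def classify_memory_reading(
    used_fraction: float,
    reserved_fraction: float = 0.9,  # assumed to mirror vLLM's gpu_memory_utilization
    tolerance: float = 0.05,         # hypothetical slack for non-engine allocations
) -> str:
    """Label an HBM usage reading relative to what the engine reserves up front."""
    # Usage at or below the reserved budget (plus slack) is normal for an
    # engine that claims its memory budget at startup.
    if used_fraction <= reserved_fraction + tolerance:
        return "expected"
    # Only usage clearly above the configured budget deserves a closer look.
    return "investigate"


if __name__ == "__main__":
    print(classify_memory_reading(0.93))
    print(classify_memory_reading(0.99))
```

With a rule like this in the tuner's context, the ~93% HBM reading above would fall inside the reserved budget instead of being reported as an error.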
Strengths of AI Tuner:
Can report compute-bound evidence: p95 GPU utilization reaches 100%.
Can detect poor scheduling and batching (by adjusting max_num_batched_tokens and max_num_seqs).
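
The two observations above can be sketched in a few lines: computing the p95 of GPU utilization samples as compute-bound evidence, and enumerating candidate values for `max_num_batched_tokens` and `max_num_seqs`. This is a minimal sketch, not AI Tuner's actual implementation. It assumes utilization samples are percentages collected elsewhere; the 99% threshold, the function names, and the candidate values are all hypothetical.

```python
from itertools import product
from statistics import quantiles


def p95(samples):
    """95th percentile of a list of samples (needs at least two values)."""
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    return quantiles(samples, n=100, method="inclusive")[94]


def is_compute_bound(util_samples, threshold=99.0):
    """Treat the workload as compute-bound if p95 GPU utilization hits the threshold."""
    # The threshold is an assumed cutoff, not a value from the source.
    return p95(util_samples) >= threshold


def candidate_configs(batched_tokens_options, num_seqs_options):
    """Brute-force grid over the two scheduling/batching parameters."""
    return [
        {"max_num_batched_tokens": tokens, "max_num_seqs": seqs}
        for tokens, seqs in product(batched_tokens_options, num_seqs_options)
    ]


if __name__ == "__main__":
    # Hypothetical utilization samples (percent) and candidate values.
    samples = [100.0] * 19 + [50.0]
    print(is_compute_bound(samples))
    for config in candidate_configs([2048, 4096], [64, 128]):
        print(config)
```

The grid grows multiplicatively with each added parameter, which is why plain enumeration becomes expensive in large discrete configuration spaces.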