Initial commit: obsidian to gitea
projects/auto-tuner/scrolling.md
@@ -0,0 +1,17 @@
So far, no characteristic fundamentally different from brute-force enumeration has been found (i.e., nothing that makes the search more effective).

How can we reliably leverage general-purpose AI models to optimize real systems under noisy measurements, hard safety constraints, and large discrete configuration spaces, while preventing hallucinated actions and ensuring reproducibility?

Some problems with AI Tuner:

Missing background knowledge: it mistakes high GPU memory utilization (~93% HBM) for an error, when in fact vLLM by design holds a fixed, nearly full share of GPU memory.
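
One way to supply this missing background knowledge is to give the tuner an explicit rule instead of letting it judge raw memory numbers. The following is a minimal sketch, assuming the server was started with vLLM's `gpu_memory_utilization` setting (0.9 by default); the function name `classify_hbm_usage` and the `slack` tolerance are hypothetical and not part of AI Tuner or vLLM.

```python
def classify_hbm_usage(observed_fraction: float,
                       gpu_memory_utilization: float = 0.9,
                       slack: float = 0.05) -> str:
    """Classify observed HBM usage relative to the memory budget given to vLLM.

    observed_fraction: measured HBM usage as a fraction of total GPU memory.
    gpu_memory_utilization: the budget vLLM was configured with (assumed known).
    slack: hypothetical tolerance for memory used outside that budget.
    """
    if not 0.0 <= observed_fraction <= 1.0:
        raise ValueError("observed_fraction must be in [0, 1]")
    # Usage at or near the configured budget is normal for vLLM,
    # so it should not be reported as a problem.
    if observed_fraction <= gpu_memory_utilization + slack:
        return "expected"
    return "investigate"


if __name__ == "__main__":
    print(classify_hbm_usage(0.93))
```

With the default budget, the ~93% HBM reading from the note falls inside the tolerated range and is not flagged.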

Strengths of AI Tuner:

It can report a compute-bound workload (evidence: p95 GPU utilization reaches 100%).
It can detect poor scheduling and batching (and adjusts max_num_batched_tokens and max_num_seqs in response).
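
The compute-bound evidence above can be checked mechanically from sampled GPU utilization. This is a rough sketch, assuming utilization is sampled as percentages in the range 0 to 100 and using a nearest-rank percentile; the helper names `percentile` and `is_compute_bound` are hypothetical, and how AI Tuner actually computes its p95 is not stated in these notes.

```python
def percentile(samples, q: int) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    q percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < q <= 100:
        raise ValueError("q must be in (0, 100]")
    ordered = sorted(samples)
    # Integer ceiling of q * n / 100, avoiding floating-point rounding.
    rank = max(1, -(-q * len(ordered) // 100))
    return ordered[rank - 1]


def is_compute_bound(gpu_util_samples, threshold: float = 100.0) -> bool:
    """Return True when the p95 GPU utilization reaches the threshold."""
    return percentile(gpu_util_samples, 95) >= threshold


if __name__ == "__main__":
    # Hypothetical samples: 18 readings at 70% and 2 readings at 100%.
    samples = [70.0] * 18 + [100.0] * 2
    print(is_compute_bound(samples))
```

Because p95 only requires the top 5% of samples to hit the threshold, a short burst at 100% is enough to trigger it; a stricter check might also look at the median.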