if sent_method not in [
+ "determine_num_available_blocks",
+ "initialize_cache",
+ ]:
ray 2.46.0 -> 2.47.1
ifconfig -a
`VLLM_USE_PRECOMPILED=1 pip install --editable .`
```
[Credentials]
language=EN
endpoint=oss-cn-hangzhou.aliyuncs.com
accessKeyID=<ACCESS_KEY_ID>
accessKeySecret=<ACCESS_KEY_SECRET>
```
---
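Based on tests with Qwen3-30B (128 experts, 48 layers, 8 experts activated per token):
- Expert activation is not load-balanced in any layer (std/mean is close to 1 for every layer)
- The std in the last few layers is noticeably larger than in the earlier layers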
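
The std/mean metric above can be computed per layer from per-expert activation counts. The following is a minimal sketch, not the script used for these tests: it assumes the counts are already collected as one list of integers per layer, and the names `load_imbalance` and `per_layer_imbalance` are hypothetical.

```python
from statistics import mean, pstdev


def load_imbalance(counts):
    """Return std/mean (coefficient of variation) of per-expert token counts.

    0.0 means perfectly balanced; values near 1 mean the spread across
    experts is as large as the average load itself.
    """
    mu = mean(counts)
    if mu == 0:
        # No tokens were routed in this layer; treat it as balanced.
        return 0.0
    # Population std, since the counts cover every expert in the layer.
    return pstdev(counts) / mu


def per_layer_imbalance(counts_by_layer):
    """Apply load_imbalance to each layer's list of expert counts."""
    return [load_imbalance(layer) for layer in counts_by_layer]


if __name__ == "__main__":
    # Toy data (assumed, not measured): 2 layers, 4 experts each.
    toy = [
        [10, 10, 10, 10],  # balanced layer
        [20, 0, 20, 0],    # half of the experts receive all tokens
    ]
    for idx, ratio in enumerate(per_layer_imbalance(toy)):
        print(f"layer {idx}: std/mean = {ratio:.3f}")
```

Comparing the resulting list across layers is one way to check whether the later layers are less balanced than the earlier ones.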
TBD
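- [ ] Whether expert activation differs significantly across workloads
- [ ] Whether expert activation in adjacent layers is correlated
- [ ] How temporal patterns relate to the global activation pattern
- [ ] Understand the EP bathtub curve
- [ ] Build a table surveying the key points of existing work and compare them with our test results
- [ ] Mixing reasoning and non-reasoning requests in the same session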