Add qwen27b and qwen235b tuning notes

2026-04-11 12:07:42 +08:00
parent 31dd44c54b
commit a0b2d7eab2
2 changed files with 103 additions and 0 deletions

@@ -2,6 +2,26 @@
qwen3-235b-a22b `thinking` trace, `decode_only` mode, internal vLLM (`/usr/local/bin/vllm`). SLO: `95%` pass target (p95-equivalent) on `TPOT <= 40ms`; `TTFT` not enforced.
## Setup
- Hardware: `dash0`, `8x H20`
- Model: `/home/admin/resource/model/464482ce.qwen3-235b-a22b/256k-0717`
- Engine: internal vLLM, decode-only mode with `--kv-transfer-config {"kv_connector":"DecodeBenchConnector","kv_role":"kv_both"}`
- Baseline topology: `TP=4, DP=2, EP=8`
- Trace: `thinking_w20260327_1000`
- Trace source: `trace_windows/traces/thinking_w20260327_1000.jsonl`
- Window duration: `600s` (`10:00-10:10`, `2026-03-27`)
- Request mode: `decode_only`
- SLO:
  - pass target: `95%`
  - `TPOT <= 40ms`
  - `TTFT` not enforced
- Search:
  - `sampling_u in [0, 0.125]`
  - `max_probes = 6`
  - `12` trials total
- Proposal model: `codex / gpt-5.4`
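
The SLO in the setup above can be read as a simple pass-rate check: a trial passes when at least `95%` of requests have `TPOT <= 40ms`. The sketch below is illustrative only. It assumes one TPOT value per request, in milliseconds, is already available; the names `slo_pass_rate` and `meets_slo` are hypothetical and are not taken from the tuner, whose actual measurement and aggregation logic is not shown in these notes.

```python
from typing import Sequence

# Thresholds copied from the Setup list; everything else is an assumption.
TPOT_LIMIT_MS = 40.0
PASS_TARGET = 0.95


def slo_pass_rate(tpot_ms: Sequence[float], limit_ms: float = TPOT_LIMIT_MS) -> float:
    """Return the fraction of requests whose TPOT is within the limit."""
    if not tpot_ms:
        raise ValueError("need at least one request to evaluate the SLO")
    passed = sum(1 for t in tpot_ms if t <= limit_ms)
    return passed / len(tpot_ms)


def meets_slo(
    tpot_ms: Sequence[float],
    limit_ms: float = TPOT_LIMIT_MS,
    target: float = PASS_TARGET,
) -> bool:
    """True when the pass rate reaches the target. TTFT is not checked."""
    return slo_pass_rate(tpot_ms, limit_ms) >= target


if __name__ == "__main__":
    # Hypothetical per-request TPOT samples, not measured data.
    sample = [31.2, 35.0, 38.9, 39.5, 44.1]
    print(f"pass rate: {slo_pass_rate(sample):.2f}, meets SLO: {meets_slo(sample)}")
```

With a `95%` pass target this is the same condition as requiring the 95th-percentile TPOT to be at most `40ms`, which is presumably why the header calls it "p95-equivalent".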
## Run assets
- Study root: `/home/admin/cpfs/wjh/aituner/aituner/.aituner-decode/dash0-qwen235b-decode-thinking-run5-tpot40-topology`