|
|
9bb5c5c328
|
tools: add correctness + performance test scripts for Qwen3-8B
- test_correctness.py: compare prefill logits top-20 vs HF transformers
- bench_server.py: HTTP API benchmark (throughput, streaming, concurrent, EOS leak check)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-05-23 14:13:49 +08:00 |
|