xserv/.gitignore at c679f618fd8f4ed1adaf3fbc49633d89ec66e683 - xserv - Local Gitea

gahow/xserv

Gahow Wang 3f1c3d429a docs: llama.cpp vs xserv benchmark results + summary

Record what the new baseline adds (llama.cpp pinned b9371, same BF16 weights,
AIME 2025 + GSM8K) and the measured results: performance (xserv ~0.45-0.61x
llama.cpp throughput) and quality parity (GSM8K 94% vs 96%, AIME 23.3% vs 20%
after the context fix), plus the findings the bench surfaced.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-05-28 15:06:21 +08:00

23 lines

382 B

Plaintext

/target
*.o
*.so
*.a
*.ptx
*.cubin
**/*.rs.bk
.env
*.npy
# llama.cpp baseline (cloned/submoduled by tools/setup-llama-cpp.sh)
/third_party/llama.cpp/build/
/third_party/llama.cpp/models/
*.gguf
# Benchmark output + fetched datasets (transferred to GPU host, not committed)
/bench-out/
/tools/bench/data/
/tools/__pycache__/
/tools/bench/__pycache__/
/tools/bench/**/__pycache__/