xserv/.gitignore at 7cb9ee3870f1e6b9ff3d7080b07a572fdacf2ac2

gahow/xserv


Gahow Wang 49c7653222 tools: add llama.cpp comparison baseline + standard benchmark suite

Vendor llama.cpp as a submodule pinned to b9371 and add a one-click
benchmark driver that compares xserv against it on identical workloads:

- setup-llama-cpp.sh: network-optional CUDA build (SM120); convert-to-gguf.sh
  converts the same safetensors to BF16 GGUF for an apples-to-apples baseline.
- tools/bench/: black-box OpenAI-API driver measuring TTFT/TPOT/throughput
  (single-stream + concurrent) and response quality on AIME 2025 + GSM8K.
- fetch_datasets.py pulls datasets to local JSON (GPU host has no network);
  task loaders prefer the local JSON.
- sync-and-build.sh: `bench` subcommand transfers source + datasets to the
  GPU host via tar-over-ssh (no rsync there), builds, and runs the suite.
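
The latency metrics named above can be illustrated with a minimal sketch. It is not the code in tools/bench/; `StreamMetrics` and `stream_metrics` are hypothetical names, and the definitions are the common ones: TTFT as the delay to the first streamed token, TPOT as the mean gap between later tokens. The actual driver may define them differently.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class StreamMetrics:
    ttft_s: float        # time to first token, in seconds
    tpot_s: float        # mean time per output token after the first
    tokens_per_s: float  # output tokens divided by total wall time


def stream_metrics(request_start: float, token_times: Sequence[float]) -> StreamMetrics:
    """Compute latency metrics for one streamed response (illustrative only).

    request_start: timestamp taken just before the request is sent.
    token_times:   one timestamp per received output token, ascending,
                   taken from the same clock as request_start.
    """
    if not token_times:
        raise ValueError("no tokens received")
    n = len(token_times)
    ttft = token_times[0] - request_start
    # TPOT excludes the first token, whose latency is dominated by prefill.
    tpot = (token_times[-1] - token_times[0]) / (n - 1) if n > 1 else 0.0
    total = token_times[-1] - request_start
    throughput = n / total if total > 0 else 0.0
    return StreamMetrics(ttft, tpot, throughput)


if __name__ == "__main__":
    # Five tokens: the first arrives after 0.5 s, the rest 0.1 s apart.
    print(stream_metrics(0.0, [0.5, 0.6, 0.7, 0.8, 0.9]))
```

In a real run the timestamps would come from a monotonic clock such as `time.perf_counter()`, sampled as each streamed chunk arrives.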

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-05-28 11:18:52 +08:00

22 lines, 362 B, Plaintext


/target
*.o
*.so
*.a
*.ptx
*.cubin
**/*.rs.bk
.env
*.npy
# llama.cpp baseline (cloned/submoduled by tools/setup-llama-cpp.sh)
/third_party/llama.cpp/build/
/third_party/llama.cpp/models/
*.gguf
# Benchmark output + fetched datasets (transferred to GPU host, not committed)
/bench-out/
/tools/bench/data/
/tools/bench/__pycache__/
/tools/bench/**/__pycache__/