- GPU monitor: 5s interval nvidia-smi sampling during benchmarks
- A/B test script: clean restart + monitor + benchmark for Combined vs PD-Sep
- Fixed proxy: await bootstrap init (race condition), normalized LB scoring
- Fixed port conflicts: proxy 9090 to avoid bootstrap 9000 clash

Key finding: PD-Sep GPU utilization is 40% of Combined (12.4% vs 30.5%)
- Decode GPUs: mean=7.8%, max=47% (memory-bound, compute wasted)
- Prefill GPUs: active only 17% of samples (bursty, idle between requests)
- Combined: 8 GPUs flexibly used, mean=30.5%, active=64%

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
19 lines
618 B
Bash
Executable File
#!/bin/bash
# Sample GPU utilization at a fixed interval (default 5s), output CSV
# Usage: bash gpu_monitor.sh [output_file] [interval_s]
# Runs until killed (Ctrl+C or kill)

OUT="${1:-/tmp/gpu_util.csv}"
INTERVAL="${2:-5}"

echo "timestamp,gpu,util_pct,mem_used_mb,mem_total_mb,power_w" > "$OUT"

while true; do
    # One timestamp per sampling pass, shared by every GPU row in that pass
    TS=$(date +%s.%N)
    nvidia-smi --query-gpu=index,utilization.gpu,memory.used,memory.total,power.draw \
        --format=csv,noheader,nounits 2>/dev/null | while IFS=', ' read -r idx util mem_used mem_total power; do
        echo "$TS,$idx,$util,$mem_used,$mem_total,$power"
    done >> "$OUT"
    sleep "$INTERVAL"
done
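The commit message reports per-group figures such as mean, max, and the share of active samples. A minimal sketch of how such numbers might be derived from the CSV this script writes is shown below. The function name `summarize_gpu_csv` is hypothetical, and treating a sample as "active" when `util_pct` is greater than 0 is an assumption, since the source does not state the threshold it used.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original repo: per-GPU summary of the
# CSV written by gpu_monitor.sh.
# Assumption: a sample counts as "active" when util_pct > 0.
summarize_gpu_csv() {
    awk -F',' '
        NR == 1 { next }                 # skip the CSV header row
        {
            gpu = $2
            util = $3 + 0                # force numeric comparison
            n[gpu]++
            sum[gpu] += util
            if (util > max[gpu]) max[gpu] = util
            if (util > 0) active[gpu]++
        }
        END {
            for (g in n)
                printf "gpu=%s mean=%.1f max=%d active=%.0f%%\n",
                       g, sum[g] / n[g], max[g], 100 * active[g] / n[g]
        }
    ' "$1" | sort
}
```

It could be called as `summarize_gpu_csv /tmp/gpu_util.csv` once the monitor has been stopped. Grouping rows by role (for example decode versus prefill GPUs) would need a mapping from GPU index to role, which the CSV itself does not carry.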