agentic-kvc/microbench/fresh_setup/gpu_monitor.sh
Gahow Wang 3f997fda14 MB5 PD ablation v2 tooling: conc completion-panel plot + gpu_monitor dep
- plot_pd_crossover.py fig_conc: lead with request-completion % (the honest
  collapse signal; latency percentiles count successes only), then mean-E2E /
  TPS; note PD-capped/colo-uncapped in the title.
- add microbench/fresh_setup/gpu_monitor.sh (referenced by the committed
  mb5_run_gpu.sh:73 for per-GPU util collection).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-01 09:35:05 +08:00

19 lines
618 B
Bash
Executable File

#!/bin/bash
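# Sample per-GPU utilization, memory, and power draw at a fixed interval; output CSV
# Usage: bash gpu_monitor.sh [output_file] [interval_s]
#   output_file defaults to /tmp/gpu_util.csv, interval_s defaults to 5
# Runs until killed (Ctrl+C or kill)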
OUT="${1:-/tmp/gpu_util.csv}"
INTERVAL="${2:-5}"
echo "timestamp,gpu,util_pct,mem_used_mb,mem_total_mb,power_w" > "$OUT"
while true; do
    TS=$(date +%s.%N)
    nvidia-smi --query-gpu=index,utilization.gpu,memory.used,memory.total,power.draw \
        --format=csv,noheader,nounits 2>/dev/null | while IFS=', ' read -r idx util mem_used mem_total power; do
        echo "$TS,$idx,$util,$mem_used,$mem_total,$power"
    done >> "$OUT"
    sleep "$INTERVAL"
done
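
The CSV this script writes has one row per GPU per sample, so a per-GPU summary takes a single pass with awk. The following is a minimal sketch, not part of the repository: the function name `summarize_gpu_csv` and its output format are hypothetical, and it assumes the six-column header shown above with numeric `util_pct` and `mem_used_mb` values.

```shell
#!/bin/bash
# Hypothetical helper (not in the repository): summarize a CSV produced by
# gpu_monitor.sh into per-GPU sample count, mean utilization, and peak memory.
# Assumes columns: timestamp,gpu,util_pct,mem_used_mb,mem_total_mb,power_w
summarize_gpu_csv() {
    awk -F, '
        NR == 1 { next }            # skip the header row
        NF < 6  { next }            # skip rows truncated by a mid-write kill
        {
            n[$2]++                 # samples seen for this GPU index
            util[$2] += $3          # running sum of utilization percent
            if ($4 + 0 > peak[$2] + 0) peak[$2] = $4   # peak memory used (MB)
        }
        END {
            for (g in n)
                printf "gpu=%s samples=%d mean_util_pct=%.1f peak_mem_mb=%d\n",
                       g, n[g], util[g] / n[g], peak[g]
        }
    ' "$1" | sort -t= -k2n          # order rows by numeric GPU index
}
```

A typical pattern would be to start the monitor in the background before a run (`bash gpu_monitor.sh /tmp/run_gpu.csv 2 & MON=$!`), stop it with `kill "$MON"` afterwards, and then call `summarize_gpu_csv /tmp/run_gpu.csv`; the path and interval here are illustrative only.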