Reuse and concurrency axes redone with proper controlled variables, plus
the orchestration used to run them on dash0:
- run_reuse_fixed.sh: hold REAL prefill work (delta) constant and vary only
the cached prefix, so reuse = C/(C+U) with C = cached prefix tokens and
U = delta (uncached tokens). Supersedes old fig1, which held input=8192 and
sliced the prefix out of it, confounding "more reuse" with "less prefill".
- run_conc.sh: agentic-corner config (in=32768, delta=512, reuse=0.984,
out=128) that exposes PD's structural KV-transfer tax. Supersedes old fig3.
- run_campaign{,2,3}.sh, backfill_d2048o128.sh: serial campaign drivers
(strictly one driver at a time), out=128 sweeps, PD wall-cap for
collapse-draining high-reuse arms, and flaked-arm backfill.
- mb5_run_gpu.sh: per-config bring-up / replay / teardown orchestrator.
- plot_pd_crossover.py: render the reuse_compare figures from fig_agg dumps.
- fig_agg.py: tolerate null stats from fully-collapsed arms (an arm with 0
successes writes its stat keys as null, so `dict.get(k, {})` returns None,
not {}).
Data: fig1_reuse_fixed.json, fig1_reuse_d{1024,2048}_o128.json
Figs: reuse_compare_AB.png, reuse_compare_ABC.png
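
The reuse definition above can be checked with a few lines of shell. This is a minimal sketch, not code from run_reuse_fixed.sh: the helper name `reuse_ratio` and the cached-prefix values in the loop are assumptions chosen for illustration. Only the formula reuse = C/(C+U) and the in=32768, delta=512 corner come from this commit.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): reuse = C / (C + U),
# where C = cached prefix tokens and U = delta (uncached prefill tokens).
reuse_ratio() {
  LC_ALL=C awk -v c="$1" -v u="$2" 'BEGIN { printf "%.3f\n", c / (c + u) }'
}

# Hold delta fixed and vary only the cached prefix, so the real prefill
# work stays constant while reuse changes. The cached values are illustrative.
delta=512
for cached in 512 4096 32256; do
  printf 'cached=%-6d delta=%d input=%-6d reuse=%s\n' \
    "$cached" "$delta" "$(( cached + delta ))" "$(reuse_ratio "$cached" "$delta")"
done
```

With cached=32256 and delta=512 the input is 32768 and the ratio rounds to 0.984, which agrees with the agentic-corner config listed for run_conc.sh.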
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
27 lines
1.3 KiB
Bash
#!/usr/bin/env bash
# Campaign 2 (2026-05-31): two extra reuse sweeps at out=128 (user request:
# delta=1024/out=128 and delta=2048/out=128), then the capped conc restart.
# STRICTLY one driver at a time; reuse sweeps run uncapped (mild collapse, matches
# the existing d2048/o256 sweep), conc runs with the PD-arm wall-cap. NO set -e.
cd /home/admin/cpfs/wjh/agentic-kv-fresh
export MB5_VENV="${MB5_VENV:-/home/admin/cpfs/wjh/agentic-kv-fresh/.venv_dash0}"
FS=microbench/fresh_setup

echo "=== CAMPAIGN2 START $(date) ==="

echo "=== [1/3] REUSE delta=1024 out=128 (reuse 0.33-0.97) $(date) ==="
DELTA=1024 OL=128 bash "$FS/run_reuse_fixed.sh"; rc1=$?
echo "=== reuse d1024 o128 rc=$rc1 $(date) ==="
sleep 12; nvidia-smi --query-gpu=index,memory.used --format=csv,noheader | head -8

echo "=== [2/3] REUSE delta=2048 out=128 (reuse 0.20-0.95) $(date) ==="
DELTA=2048 OL=128 bash "$FS/run_reuse_fixed.sh"; rc2=$?
echo "=== reuse d2048 o128 rc=$rc2 $(date) ==="
sleep 12; nvidia-smi --query-gpu=index,memory.used --format=csv,noheader | head -8

echo "=== [3/3] CONC capped (PD wall=${CONC_PD_MAXDUR:-600}s, colo uncapped), N 8..128 $(date) ==="
NLIST="8 16 32 48 64 96 128" bash "$FS/run_conc.sh"; rc3=$?
echo "=== conc rc=$rc3 $(date) ==="

echo "=== CAMPAIGN2 DONE reuse_d1024_o128=$rc1 reuse_d2048_o128=$rc2 conc=$rc3 $(date) ==="
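
The driver above runs without `set -e` and captures `$?` after each stage, so one failing sweep does not stop the ones that follow and every exit status still reaches the final summary line. The same pattern can be sketched as a small helper. The names `run_stage` and `LAST_RC` are hypothetical and do not appear in the repo, and `false` / `true` stand in for the real sweep commands.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run one stage, record its exit status in LAST_RC,
# and always return 0 so a failed stage never aborts the campaign.
run_stage() {
  stage_name=$1; shift
  echo "=== $stage_name START $(date) ==="
  # In the else branch, $? still holds the exit status of the command.
  if "$@"; then LAST_RC=0; else LAST_RC=$?; fi
  echo "=== $stage_name rc=$LAST_RC $(date) ==="
}

# Stand-in stages: the first fails, the second still runs.
run_stage first false; rc1=$LAST_RC
run_stage second true; rc2=$LAST_RC

echo "=== DONE first=$rc1 second=$rc2 $(date) ==="
```

Recording the status inside an `if` keeps the helper safe even when a caller has enabled `set -e`, at the cost of one extra variable per stage.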