traces/w600_r0.0015_st30_first600s.jsonl: first-600s cut of the shipped w600 trace (807 reqs, 274 sessions, all turn-1s + early later-turns; theoretical APC ceiling ~70% vs 80% full). Faster iteration (~18 min/arm) but a colder, lower-locality regime; whitelisted alongside the parent anonymized trace. analysis/lpwl_5policy_600s.md: LPWL vs LMetric/sticky/unified/unified+A+B on the 600s trace (dash1 8xH20, cold APC, n=1). LPWL is overall best with zero knobs — TTFT p90 7983ms vs tuned A+B 11562 (-31%), E2E p90 -16%, best request balance; APC 0.648 (emergent affinity, far above LMetric 0.507); only loss is E2E p99 from heavy-class decode concentration. Demonstrates anti-overfit: A+B was tuned on full w600 yet is beaten by the knob-free policy on this regime. Includes the run_5policy_600s.sh repro driver. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
37 lines
1.7 KiB
Bash
37 lines
1.7 KiB
Bash
#!/usr/bin/env bash
|
|
# 5-policy comparison on the first-600s trace, fresh vLLM (cold APC) each arm.
|
|
# leastwork LPWL (parameter-free)
|
|
# unified_ab unified hybrid, A+B' tuned (of=1.3, lmw=0.01)
|
|
# unified_def unified hybrid, defaults (of=2.0, lmw=0.0)
|
|
# lmetric P_tokens x BS, no affinity
|
|
# sticky hard session affinity
|
|
set -uo pipefail
|
|
export ROOT=/home/admin/cpfs/wjh/agentic-kv
|
|
export MODEL=/home/admin/cpfs/wjh/models/Qwen/Qwen3-Coder-30B-A3B-Instruct
|
|
TRACE="${TRACE:-$ROOT/traces/w600_r0.0015_st30_first600s.jsonl}"
|
|
DATE="$(date +%Y%m%d_%H%M)"
|
|
OUT="${OUTROOT:-$ROOT/outputs/policy5_600s_$DATE}"
|
|
mkdir -p "$OUT"
|
|
echo "OUTROOT=$OUT" | tee "$OUT/RUNINFO.txt"
|
|
echo "TRACE=$TRACE ($(wc -l < "$TRACE") reqs)" | tee -a "$OUT/RUNINFO.txt"
|
|
date | tee -a "$OUT/RUNINFO.txt"
|
|
|
|
run_arm() { # name policy extra_args
|
|
local name="$1" policy="$2" extra="$3"
|
|
echo "===== ARM: $name (policy=$policy args='$extra') =====" | tee -a "$OUT/RUNINFO.txt"
|
|
local t0=$(date +%s)
|
|
EXTRA_PROXY_ARGS="$extra" bash "$ROOT/scripts/b3_isolated_policy.sh" \
|
|
"$policy" "$TRACE" "$OUT/$name" > "$OUT/$name.log" 2>&1
|
|
echo "$name rc=$? rows=$(wc -l < "$OUT/$name/metrics.jsonl" 2>/dev/null) dur=$(( $(date +%s) - t0 ))s" | tee -a "$OUT/RUNINFO.txt"
|
|
}
|
|
|
|
# Headline contrasts first (so a late failure still leaves the key arms).
|
|
run_arm leastwork leastwork ""
|
|
run_arm unified_ab unified "--overload-factor 1.3 --lmetric-decode-weight 0.01"
|
|
run_arm unified_def unified "--overload-factor 2.0 --lmetric-decode-weight 0.0"
|
|
run_arm lmetric lmetric ""
|
|
run_arm sticky sticky ""
|
|
|
|
echo "===== ALL DONE: $OUT =====" | tee -a "$OUT/RUNINFO.txt"
|
|
date | tee -a "$OUT/RUNINFO.txt"
|