Files
agentic-kvc/analysis/characterization/current_results/reproduction_commands.sh
Gahow Wang 4722883903 Audit package refresh: Window 1 supported claims + risk register
Refresh the standing audit package now that B1' / B2 / B3 are complete.

current_results/characterization_claim_matrix.md
  Flips seven entries from "not_yet_supported" / "partially_supported"
  to "supported" with pointers into window_1_results/. New entries
  cover per-session sequentiality, KV per request, real reuse
  decomposition, theoretical APC ceiling, the LMetric locality gap,
  Unified breaking the locality-vs-latency tradeoff, B2 causal
  interference proof, sticky's interference inflation, and the
  partial heavy-tail / hot-spot story. B4 SRR + B5 attribution stay
  "not_yet_supported" (Window 2 work).

current_results/main_claim_allowed_runs.md
  New "Allowed For Routing-Policy Comparison" section pins the five
  B3 policy directories. New "Allowed For PD-colo Interference"
  section pins the B2 sweep. Legacy section retained for the
  pre-instrumentation 200/500/1000-req runs.

current_results/reviewer_risk_register.md
  Marks the two old "high"-severity risks (sequentiality / reuse
  decomposition) as resolved; adds new entries for the APC
  contamination empirics, the b3_analyze.sh truncate-write bug that
  cost unified's interference index, the GPU-0 EngineCore ghost
  cleanup, the saturated-replay caveat for trace-timestamp dispatch,
  and the synthetic B2 decode workload.

current_results/all_figures_index.md
  Adds the 8 new Window 1 figures alongside the existing 6 from the
  legacy summarize_runs run.

current_results/reproduction_commands.sh
  Records the full B3 + B2 + figure pipeline.

analysis/characterization_todo_for_interns.md
  Updates the Progress Snapshot table: B0, B1, B2, B3, B6 all DONE;
  only B4 and B5 remain (Window 2).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 23:25:27 +08:00

63 lines
2.5 KiB
Bash
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

#!/usr/bin/env bash
set -euo pipefail
# Window 0 audit refresh (legacy run summaries).
python3 analysis/characterization/summarize_runs.py \
--output-dir analysis/characterization/current_results \
--runs outputs/gpu_ab_combined outputs/gpu_ab_pdsep \
outputs/contention_16s_ts10 outputs/contention_16s_elastic \
outputs/combined_1000req outputs/exp3_pd_sep_tp1_mooncake
# B1' Per-request KV footprint on the full trace (runs on dash0 directly,
# CPU-only; the formatted full trace is hundreds of GiB).
python3 analysis/characterization/analyze.py \
--trace ~/ali-trace/trace-glm5.1-formatted/051315-051317.jsonl \
--kv-bytes-per-token 98304 \
--task-name full_trace_with_kv \
--output-root outputs/characterization \
--overwrite
# w600 trace APC theoretical bound.
python3 scripts/compute_apc_upper_bound.py \
--trace traces/w600_r0.0015_st30.jsonl \
--out outputs/apc_upper_w600.json
# B3 5-policy routing sweep on dash0 (8 × TP1 instances).
# First three policies share one vLLM lifecycle (hot-cache, fast):
bash scripts/b3_sweep.sh # writes outputs/b3_sweep_<TS>/
# Last two run isolated with cold vLLM:
bash scripts/b3_isolated_policy.sh unified \
traces/w600_r0.0015_st30.jsonl \
outputs/b3_sweep_<TS>/unified
python3 scripts/build_capped_trace.py \
--input traces/w600_r0.0015_st30.jsonl \
--output outputs/b3_sweep_<TS>/capped/trace.jsonl \
--max-turns 8
bash scripts/b3_isolated_policy.sh lmetric \
outputs/b3_sweep_<TS>/capped/trace.jsonl \
outputs/b3_sweep_<TS>/capped
# B3 analysis (joined records + indices) and report.
bash scripts/b3_analyze.sh outputs/b3_sweep_<TS>
python3 scripts/render_b3_report.py --sweep-dir outputs/b3_sweep_<TS>
# B2 PD-colo interference microbench. Launch 2 vLLM instances on
# ports 8100 and 8101 with --enable-prompt-tokens-details first, then:
python3 scripts/b2_interference.py \
--decode-endpoint http://127.0.0.1:8100 \
--prefill-endpoint http://127.0.0.1:8101 \
--model <model-path> \
--out-dir outputs/b2_microbench/sweep \
--prefill-sizes 2048,8192,16384,32768,65536 \
--variants different,same
python3 analysis/characterization/b2_sweep_analysis.py \
--sweep-dir outputs/b2_microbench/sweep
# Window 1 figure rendering (CPU only).
python3 analysis/characterization/render_window1_figures.py \
--results-dir analysis/characterization/window_1_results \
--out-dir analysis/characterization/window_1_results/figures