- REPORT.md: self-contained milestone report covering baseline vs elastic setup, exact launch commands, benchmark params, results, log locations, and repo structure; sufficient for anyone to reproduce
- analysis/pd_separation_analysis.md §5: elastic P2P system-level breakdown (KV cache hit ratio, per-class TTFT, GPU util paradox explanation)
- scripts/cache_aware_proxy.py: round-robin P-instance selection replacing argmin(ongoing_tokens) to fix GPU load imbalance (3.0x → expected ~2x)
- scripts/launch_elastic_p2p.sh: one-command launch for elastic P2P config

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
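The selection change noted for scripts/cache_aware_proxy.py can be illustrated with a minimal sketch. This is not the code from the repository: the names `pick_argmin`, `RoundRobinSelector`, and the `prefill-N` instance labels are hypothetical, and the sketch only assumes that each P-instance reports an ongoing-token count.

```python
from itertools import cycle


def pick_argmin(ongoing_tokens):
    """Policy being replaced (as described in the commit message):
    choose the instance with the fewest ongoing tokens.
    `ongoing_tokens` maps a hypothetical instance name to its token count."""
    return min(ongoing_tokens, key=ongoing_tokens.get)


class RoundRobinSelector:
    """Replacement policy: hand out instances in a fixed rotation,
    ignoring the reported load entirely."""

    def __init__(self, instances):
        instances = list(instances)
        if not instances:
            raise ValueError("at least one instance is required")
        self._rotation = cycle(instances)

    def pick(self):
        # Each call advances the rotation by one instance.
        return next(self._rotation)


# Hypothetical instance names, for illustration only.
instances = ["prefill-0", "prefill-1", "prefill-2"]
selector = RoundRobinSelector(instances)
picks = [selector.pick() for _ in range(6)]
print(picks)
```

With round-robin, every instance receives the same number of requests over a full rotation regardless of the token counts it reports, whereas `pick_argmin` depends on how current those counts are at selection time.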
10 lines
125 B
Plaintext
__pycache__/
*.pyc
.venv/
*.egg-info/
outputs/
traces/
*.log
.claude/
# third_party/vllm tracked in git for patch management