4bf0b999ff8db502965276e353a7c2b3852a2bd7
Complete 200-req comparison with GPU monitoring:

Config                      TTFT50  TPOT90  E2E50  GPU%   Active  APC
Combined (old cache-aware)  1.012   0.073   5.101  30.5%  64%     44.7%
Combined (hybrid routing)   1.064   0.072   5.131  27.7%  60%     49.4%
PD-Sep 4P+4D                1.994   0.075   7.112  12.4%  24%     40.2%
PD-Sep 6P+2D                1.481   0.077   5.949  16.9%  28%     ~37%

Hybrid routing: +4.7pp APC with comparable latency and GPU utilization.
PD-Sep: significantly worse on all dimensions for single-machine agentic.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
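The headline "+4.7pp APC with comparable latency" can be re-derived from the two Combined rows. The sketch below is illustrative only: the dictionary keys and variable names are hypothetical and not taken from the repository, and the source does not state units for the latency columns, so only differences and ratios are computed.

```python
# Figures copied from the two "Combined" rows of the comparison table.
# Key names are hypothetical labels chosen for this sketch.
old_cache_aware = {"ttft50": 1.012, "tpot90": 0.073, "e2e50": 5.101, "apc_pct": 44.7}
hybrid_routing = {"ttft50": 1.064, "tpot90": 0.072, "e2e50": 5.131, "apc_pct": 49.4}


def relative_change_pct(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# APC is already a percentage, so the difference is in percentage points (pp).
apc_delta_pp = round(hybrid_routing["apc_pct"] - old_cache_aware["apc_pct"], 1)

# Latency columns are compared as relative changes, since units are not stated.
ttft_change_pct = relative_change_pct(old_cache_aware["ttft50"], hybrid_routing["ttft50"])
e2e_change_pct = relative_change_pct(old_cache_aware["e2e50"], hybrid_routing["e2e50"])

print(f"APC delta: {apc_delta_pp:+.1f}pp")
print(f"TTFT50 change: {ttft_change_pct:+.1f}%")
print(f"E2E50 change: {e2e_change_pct:+.1f}%")
```

Reporting the APC gain in percentage points rather than as a relative percentage matches the wording of the commit message.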
Languages: Python 82.9%, Shell 17.1%