Extends exp(c) (dispatch ablation, 1 round-robin policy) to the full 5-policy
routing comparison, running both dispatch modes (tracets, thinktime) on the SAME
ttp trace (807 reqs, fresh vLLM per arm, dash0 8xH20). Confirms exp(c)'s
prediction and finds something stronger: the dispatch mode FLIPS which policy
wins.
- thinktime helps every policy but helps LPWL most (TTFT p90 -40%, E2E mean -31%
vs -3..-16% for the rest): tracets bursts punish prefill-spreading.
- Ranking flip: under tracets, LPWL only ties unified_ab on TTFT p90 and is 3rd
on E2E mean; under thinktime, LPWL is 1st on both, beating the tuned
unified+A+B (TTFT p90 -31%, best TPOT/balance) with zero knobs.
- => benchmark agentic routing with thinktime; tracets' burst artifact erases
LPWL's advantage. Caveat (n=1): the tracets ranking is run-sensitive (it does
not reproduce dash1 lpwl_5policy_600s.md); the thinktime advantage is the robust
signal (it appears in both environments).
README + grouped-bar fig (figs/exp_d_policy_dispatch.png) + bench_report
summaries in results/.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>