The B3 audit flagged the trace replayer's "fire turn N+1 immediately
if turn N is behind schedule" semantics as a potential benchmark
crime, because under saturation the effective arrival process becomes
policy-dependent (slow policy -> longer session lifetimes -> more
concurrent in-flight -> harder system -> still slower). The audit
called this dispatch slip.
But in agentic workloads, turn N+1 is generated by a tool-call
response or an autonomous-loop step, not by a human reading the
previous reply. There is no inter-turn think-time. So the replayer's
"no think-time, sequential within session, fire-immediately-when-
ready" behavior is the correct model of agentic production, and the
feedback amplification is a real property of production systems
under saturation rather than an artifact of the replayer.
The note (analysis/characterization/agentic_dispatch_coupling.md)
lays out:
- The dispatch rule and the apparent feedback loop
- Why agentic workloads do not have user think-time
- Application of Little's Law: slower policy carries higher concurrent
in-flight load, so the policy x feedback gap is real, not artifact
- Reframes B3 as the "production-replay" experiment and B4 as the
orthogonal "controlled-load" experiment, complementary not
hierarchical
- Calls the feedback amplification itself out as a finding worth
reporting (e.g. unified's ~2x latency-p90 gap over lmetric in B3
reflects both the routing improvement and the in-flight reduction)
- Contrasts with chat workloads (human think-time partially breaks
the feedback loop, agentic removes that floor)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>