agentic-kvc

Gahow Wang 8ac41a8684 Agentic dispatch coupling: trace-replay session-sequentiality is realistic

The B3 audit flagged the trace replayer's "fire turn N+1 immediately
if turn N is behind schedule" semantics as a potential benchmark
crime: under saturation the effective arrival process becomes
policy-dependent (slower policy -> longer session lifetimes -> more
concurrent in-flight requests -> more heavily loaded system -> slower
still). The audit called this "dispatch slip".

But in agentic workloads, turn N+1 is generated by a tool-call
response or an autonomous-loop step, not by a human reading the
previous reply. There is no inter-turn think-time. So the replayer's
"no think-time, sequential within session, fire-immediately-when-
ready" behavior is the correct model of agentic production, and the
feedback amplification is a real property of production systems
under saturation rather than an artifact of the replayer.
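
The dispatch rule described above can be sketched as follows. This is
an illustration only, not the replayer's actual code: the function
name, the argument names, and the assumption that an on-schedule turn
waits for its trace timestamp are all hypothetical.

```python
def replay_session(scheduled, latencies):
    """Return (dispatch, completion) times for one session.

    scheduled[i] -- trace timestamp at which turn i was originally sent
    latencies[i] -- service time of turn i under the policy being tested
    Both arguments are hypothetical inputs for this sketch.
    """
    dispatch, completion = [], []
    prev_done = 0.0
    for sched, lat in zip(scheduled, latencies):
        # Sequential within a session: a turn cannot start before the
        # previous turn has finished. If that leaves it behind schedule,
        # it fires immediately; no think-time is inserted.
        start = max(sched, prev_done)
        prev_done = start + lat
        dispatch.append(start)
        completion.append(prev_done)
    return dispatch, completion


scheduled = [0.0, 1.0, 2.0]
fast = replay_session(scheduled, [0.5, 0.5, 0.5])
slow = replay_session(scheduled, [1.5, 1.5, 1.5])
print(fast)  # ([0.0, 1.0, 2.0], [0.5, 1.5, 2.5])
print(slow)  # ([0.0, 1.5, 3.0], [1.5, 3.0, 4.5])
```

With the same trace, the slow policy's turns slip past their scheduled
times and the session lives longer (4.5 vs 2.5 time units), which is
the policy-dependent arrival process the audit pointed at.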

The note (analysis/characterization/agentic_dispatch_coupling.md):
- Describes the dispatch rule and the apparent feedback loop
- Explains why agentic workloads do not have user think-time
- Applies Little's Law: a slower policy carries a higher concurrent
  in-flight load, so the policy x feedback gap is real, not an
  artifact
- Reframes B3 as the "production-replay" experiment and B4 as the
  orthogonal "controlled-load" experiment, complementary rather than
  hierarchical
- Calls out the feedback amplification itself as a finding worth
  reporting (e.g. unified's ~2x latency-p90 gap over lmetric in B3
  reflects both the routing improvement and the in-flight reduction)
- Contrasts with chat workloads (human think-time partially breaks
  the feedback loop; agentic workloads remove that floor)
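
The Little's Law step (L = lambda * W) can be checked with a short
calculation. The numbers below are made up for illustration and are
not measurements from B3; the function name is hypothetical.

```python
def mean_in_flight(arrival_rate, mean_lifetime):
    """Little's Law: mean number in system L = lambda * W."""
    return arrival_rate * mean_lifetime


# Illustrative values only (not taken from B3): both policies see the
# same session arrival rate, but the slower one keeps sessions alive
# twice as long.
rate = 2.0                                     # sessions per second
fast_in_flight = mean_in_flight(rate, 30.0)    # 30 s mean lifetime
slow_in_flight = mean_in_flight(rate, 60.0)    # 60 s mean lifetime
print(fast_in_flight, slow_in_flight)  # 60.0 120.0
```

At a fixed arrival rate, doubling the mean session lifetime doubles
the mean number of sessions in flight, which is why the slower policy
faces a more heavily loaded system.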

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-05-26 01:00:25 +08:00