Record Stop-A boundary-guard A/B: correct verdict, ~38% replay saved

With the guard enabled the binary search recovers best sampling_u=0.078125
(rate 2.30 req/s), identical to the full-replay baseline. The guard fired on
exactly the one feasibility-knee probe (0.08594, re-measured full -> infeasible);
the other three probes truncated to ~45-50%. Net ~38% replay saved on the trial
with no peak-rate overestimate. Stop-A + boundary guard is safe to enable.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 16:57:53 +08:00
parent 03e556f0ab
commit f31e9ccfd5


@@ -79,6 +79,31 @@ L-C-A converges. It targets exactly this knee case at low extra cost (it only
extends replay on probes sitting on the feasibility boundary). Recommend adding it
as a small Stop-A enhancement before enabling Stop-A in production studies.
## 4. SLO-boundary guard (implemented + validated)
Added `trace.adaptive_stop.boundary_delta` (default 0.02): when a truncated probe's
measured pass-rate lands within ±δ of the SLO target, re-measure on the full window
and use that verdict. Re-ran the same config with `adaptive_stop` enabled
(τ=0.9, τ_c=0.90, δ=0.02):
| threshold | feasible | pass | selected | replayed | boundary_extended |
| --- | --- | --- | --- | --- | --- |
| 0.06250 | True | 1.000 | 1086 | 487 (45%) | — |
| 0.09375 | False | 0.444 | 1656 | 822 (50%) | — |
| 0.07812 | True | 0.994 | 1378 | 682 (49%) | — |
| 0.08594 | **False** | 0.947 | 1523 | **1523 (100%)** | **True** |
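
The guard rule described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the function name `guarded_verdict`, the `ProbeVerdict` type, and the `measure_full` callback are hypothetical, and it assumes a probe is feasible when its pass-rate is at or above the SLO target. Only `boundary_delta` and its default of 0.02 come from the text.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProbeVerdict:
    feasible: bool
    pass_rate: float
    boundary_extended: bool


def guarded_verdict(
    truncated_pass_rate: float,
    slo_target: float,
    measure_full: Callable[[], float],  # hypothetical: replays the full window
    boundary_delta: float = 0.02,
) -> ProbeVerdict:
    """Decide feasibility, re-measuring on the full window near the SLO boundary."""
    # Inside the +/- delta band the truncated estimate is not trusted:
    # pay for a full replay and use that verdict instead.
    if abs(truncated_pass_rate - slo_target) <= boundary_delta:
        full_pass_rate = measure_full()
        return ProbeVerdict(full_pass_rate >= slo_target, full_pass_rate, True)
    # Far from the boundary the truncated verdict is accepted as-is
    # (assumption: feasible means pass-rate >= target).
    return ProbeVerdict(truncated_pass_rate >= slo_target, truncated_pass_rate, False)
```

With an illustrative target of 0.95 (an assumed value, not stated in this section), a truncated estimate of 0.96 falls inside the band and triggers the full re-measurement, while estimates such as 1.000 or 0.444 are accepted without extending the replay.
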
Result: best feasible `sampling_u=0.078125` (rate 2.30 req/s) — **identical to the
full-replay baseline**. The guard fired on exactly the one knee probe and
re-measured it to the correct infeasible verdict; the other three probes truncated
to ~45-50%. Net replayed 3514/5643 requests ≈ **38% replay saved on this trial
while recovering the correct peak rate** (no one-step overestimate).
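
The replay-saved figure can be re-derived from the `selected` and `replayed` columns of the table. The snippet below is only an arithmetic check; the variable names are illustrative.

```python
# (threshold, selected, replayed) copied from the table above
probes = [
    (0.06250, 1086, 487),
    (0.09375, 1656, 822),
    (0.07812, 1378, 682),
    (0.08594, 1523, 1523),  # boundary-extended probe, replayed in full
]

selected_total = sum(selected for _, selected, _ in probes)
replayed_total = sum(replayed for _, _, replayed in probes)
saved_fraction = 1 - replayed_total / selected_total

print(f"replayed {replayed_total}/{selected_total}, saved {saved_fraction:.1%}")
```

The totals come out to 3514 replayed of 5643 selected, so roughly 38% of the replay is saved even though the knee probe was replayed in full.
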
**Conclusion: Stop-A with the boundary guard is correct (verdict matches full
replay) and still saves replay time. Safe to enable.** Configs:
`dash0_qwen30b_a3b_stopA_fulldata.json` (OFF baseline) and
`dash0_qwen30b_a3b_stopA_on.json` (ON).
## Repro
```