scale=0.2 made TP1 uniformly infeasible (no baseline); bound decode to 128 tokens and
use mild 2x compression so TP1 registers a real, fast baseline, with 6 probes to span
TP1's low and TP4's high feasibility boundaries. Both configs identical except use_harness.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>