Resolves AUDIT_AND_ROADMAP §S6: the 785 lines of vendored
SGLang patch are a known reviewer trust risk because the
prototype touches scheduler.py / schedule_batch.py /
session_aware_cache.py / disaggregation hot paths. Without
classification, readers cannot tell core mechanism from
temporary scaffold.

Classifies each of the 10 patched files into:

  MUST-HAVE        Algorithm 1/2/3, streaming session lifecycle,
                   admit RPC. ~450 lines. Long-term retention.
  WORKAROUND       release_session token-free,
                   maybe_trim_decode_session_cache,
                   streaming-session extend_input_len correction
                   (incl. the E3 landmine hotfix from commit
                   986f351), DecodePreallocQueue trim trigger.
                   ~150 lines. To DELETE entirely after
                   block-level eviction refactor
                   (BLOCK_LEVEL_EVICTION_DESIGN §3.7).
  EXPERIMENTAL     backpressure pause hint
                   (_compute_backpressure_pause_hint). ~60 lines.
                   Signal not closed-loop per REAL_ALI §4.3;
                   retain as hook or retire in 1 month.
  INSTRUMENTATION  _compute_pool_breakdown_for_diagnostics.
                   ~50 lines. Keep behind a flag.
  MINOR            ~3 lines. Ignore.

The §2 summary gives reviewers a one-glance picture of
what's core vs. scaffold. The maintenance convention in §3
mandates classifying every new (sglang) patch at commit
time.

§4 wires the classification into the roadmap: clearing
the WORKAROUND bucket is the objective completion marker
for the block-level eviction refactor.

No code change.