Use tokenizer.is_eos() (multi-EOS) to terminate generation in both the PP and TP engines instead of comparing against a single EOS id, so gpt-oss stops on <|return|>, <|call|>, or <|endoftext|>.

In the TP engine, optionally apply a repetition penalty on the greedy decode path to break greedy repetition loops. The penalty is active when XSERV_REP_PENALTY > 1 and covers the most recent XSERV_REP_WINDOW tokens; it is off by default.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
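
Illustrative sketch (not the engine code): the two behaviours described in this commit, written as a small standalone Python model. Every name here (greedy_decode, apply_repetition_penalty, step_fn) is hypothetical. The penalty formula, dividing positive logits and multiplying negative ones, is an assumed common convention and is not confirmed by this commit. The window default of 64 is a placeholder.

```python
def apply_repetition_penalty(logits, recent_tokens, penalty):
    """Return a copy of logits with recently seen token ids made less likely.

    Assumed convention: positive logits are divided by the penalty and
    non-positive logits are multiplied by it.
    """
    out = list(logits)
    for tok in set(recent_tokens):
        if 0 <= tok < len(out):
            out[tok] = out[tok] / penalty if out[tok] > 0 else out[tok] * penalty
    return out


def greedy_decode(step_fn, prompt, eos_ids, max_new_tokens,
                  penalty=1.0, window=64):
    """Greedy decoding that stops on any id in eos_ids (multi-EOS).

    step_fn(tokens) is a stand-in for the model: it returns a list of logits.
    penalty <= 1.0 disables the repetition penalty, mirroring "off by default".
    """
    tokens = list(prompt)
    generated = []
    for _ in range(max_new_tokens):
        logits = step_fn(tokens)
        if penalty > 1.0 and window > 0:
            logits = apply_repetition_penalty(logits, tokens[-window:], penalty)
        # Greedy pick: index of the largest logit (first index wins on ties).
        next_id = max(range(len(logits)), key=logits.__getitem__)
        if next_id in eos_ids:  # set membership instead of `== eos_id`
            break
        tokens.append(next_id)
        generated.append(next_id)
    return generated
```

In a real engine the two knobs would presumably be read once from the environment, for example float(os.environ.get("XSERV_REP_PENALTY", "1.0")), and the EOS check would go through tokenizer.is_eos(); both are left out here so the sketch has no external dependencies.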