The initial equally spaced guess [11, 23, 35] was wrong. EAGLE3 heads
are trained against specific target layer indices, and using different
ones at inference gives wrong outputs. The correct values come from the
vLLM speculators training script for Qwen3-8B:
https://github.com/vllm-project/speculators/blob/main/examples/train/dflash_qwen3_8b_sharegpt_online_5k.sh
which pins target_layer_ids to "2 18 33". Re-running check-eagle3 with
the fix produces a coherent top-5 for "The capital of France is":
  Old ([11, 23, 35]): "," / " Paris" / " Madrid" / "." / " Berlin"
  New ([2, 18, 33]):  " Paris" / " Tokyo" / " Madrid" / "," / "."
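
To illustrate why the indices matter, here is a minimal sketch of
picking the per-layer hidden states that feed the head. It assumes the
head consumes the concatenation of the selected layers' vectors, and it
assumes 0-based indexing into a list of per-layer outputs; neither
detail is confirmed by this change. select_aux_states is a hypothetical
helper, not a function from vLLM or speculators.

```python
# Layer ids pinned by the training script referenced above.
TARGET_LAYER_IDS = (2, 18, 33)


def select_aux_states(per_layer_states, layer_ids=TARGET_LAYER_IDS):
    """Concatenate the hidden vectors of the chosen layers, in order.

    per_layer_states: list with one vector (list of floats) per layer.
    Assumption: index i is the output of layer i, 0-based.
    """
    bad = [i for i in layer_ids if not 0 <= i < len(per_layer_states)]
    if bad:
        raise IndexError(f"layer ids out of range: {bad}")
    fused = []
    for i in layer_ids:
        fused.extend(per_layer_states[i])
    return fused


if __name__ == "__main__":
    # Toy model: 36 layers, hidden size 2, each vector filled with its index.
    toy = [[float(i)] * 2 for i in range(36)]
    print(select_aux_states(toy))
    print(select_aux_states(toy, (11, 23, 35)))
```

The two calls return different vectors for the same toy input, which is
the whole problem: a head trained on one selection sees unfamiliar
inputs under the other.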
Top-1 still differs from the target's next token. The likely cause is
input alignment: EAGLE maps (state_that_produced_prev, prev_token) → next,
and the exact pairing convention may need one more offset check when
integrated into the full speculative loop.
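
A minimal sketch of the pairing in question, useful as a reference when
doing that offset check. It assumes states[i] is the target's hidden
state at position i, i.e. the state that produced tokens[i + 1].
eagle_pairs and state_shift are hypothetical names for illustration
only.

```python
def eagle_pairs(states, tokens, state_shift=1):
    """Build (state, prev_token, next_token) triples.

    state_shift=1 pairs each token with the state that produced it,
    matching (state_that_produced_prev, prev_token) -> next.
    state_shift=0 is the naive same-position alignment, i.e. the
    off-by-one case to rule out.
    """
    if state_shift < 0:
        raise ValueError("state_shift must be non-negative")
    triples = []
    for j in range(state_shift, len(tokens) - 1):
        triples.append((states[j - state_shift], tokens[j], tokens[j + 1]))
    return triples


if __name__ == "__main__":
    tokens = ["The", " capital", " of", " France", " is"]
    states = ["h0", "h1", "h2", "h3", "h4"]  # placeholders for real vectors
    for triple in eagle_pairs(states, tokens):
        print(triple)
```

With state_shift=1 the first triple is ("h0", " capital", " of"); with
state_shift=0 it is ("h0", "The", " capital"). Comparing the draft
head's top-1 under both alignments should show which one the loop is
actually using.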