narrow(dim, start, len) creates a zero-copy slice along any dimension. is_contiguous() now ignores stride mismatches on dimensions of size 1, since those dimensions are never stepped. This avoids unnecessary GPU strided copies when slicing fused projection outputs at batch=1.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
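
For illustration only, here is a minimal sketch of the behaviour described above. It uses plain Python tuples for shapes and strides and assumes row-major layout. The function signatures and the (1, 12) example shape are hypothetical and are not taken from this change's actual code.

```python
def is_contiguous(shape, strides):
    """Row-major contiguity check that skips dimensions of size 1.

    A size-1 dimension is never stepped, so its stride cannot affect
    which elements are addressed and is ignored here.
    """
    expected = 1
    for size, stride in zip(reversed(shape), reversed(strides)):
        if size != 1 and stride != expected:
            return False
        expected *= size
    return True


def narrow(shape, strides, offset, dim, start, length):
    """Zero-copy slice: keep the strides, shrink one dim, shift the offset."""
    if not 0 <= dim < len(shape):
        raise IndexError("dim out of range")
    if start < 0 or length < 0 or start + length > shape[dim]:
        raise ValueError("narrow range out of bounds")
    new_shape = list(shape)
    new_shape[dim] = length
    return tuple(new_shape), tuple(strides), offset + start * strides[dim]


# Hypothetical fused projection output at batch=1: shape (1, 12), strides (12, 1).
# Taking the middle third along dim 1 leaves stride 12 on the size-1 batch dim.
shape, strides, offset = narrow((1, 12), (12, 1), 0, dim=1, start=4, length=4)
print(shape, strides, offset, is_contiguous(shape, strides))
```

With a batch of 2 the same slice has shape (2, 4) and strides (12, 1); the batch dimension is then actually stepped, so the check reports non-contiguous and a strided copy would still be needed.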