Transformers have no inherent notion of order. RoPE encodes relative positions via rotation matrices. Here's the full math from sinusoidal to NTK-aware scaling.
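
As a quick illustration of the rotation idea, here is a minimal sketch in plain Python. It assumes the common convention of rotating consecutive pairs of dimensions with frequencies `base ** (-2i/d)` and `base = 10000`. The names `rope`, `dot`, `q`, and `k` are made up for this example and are not taken from the post. It only checks the key property: the attention score depends on the offset between the two positions, not on the positions themselves.

```python
import math

def rope(x, pos, base=10000.0):
    """Rotate each consecutive pair (x[2i], x[2i+1]) by the angle pos * theta_i.

    Assumed convention: theta_i = base ** (-2i / d), with d = len(x) even.
    """
    d = len(x)
    assert d % 2 == 0, "RoPE needs an even number of dimensions"
    out = []
    for i in range(d // 2):
        theta = base ** (-2.0 * i / d)
        angle = pos * theta
        c, s = math.cos(angle), math.sin(angle)
        a, b = x[2 * i], x[2 * i + 1]
        # Standard 2x2 rotation applied to the pair (a, b).
        out.extend([a * c - b * s, a * s + b * c])
    return out

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Hypothetical query and key vectors, chosen arbitrarily.
q = [0.5, -1.0, 2.0, 0.25]
k = [1.5, 0.3, -0.7, 1.0]

# Same offset (4) at two very different absolute positions.
score_near = dot(rope(q, 3), rope(k, 7))
score_far = dot(rope(q, 103), rope(k, 107))
print(abs(score_near - score_far) < 1e-9)
```

Because each pair is rotated rather than shifted, vector norms are unchanged, and the dot product of a rotated query and key reduces to a function of the position difference. Scaling schemes such as NTK-aware scaling modify the frequencies (here, the `base` argument) rather than this rotation structure.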