Posts tagged with "NTK scaling"
2 posts found
Mar 31, 2026 | RoPE, rotary position embeddings, mathematical derivation, NTK scaling, YaRN, position encoding, math
Rotary Position Embeddings: The Full Mathematical Derivation
RoPE is used by virtually every modern LLM. Here's the complete derivation from first principles, a proof of the relative-position property, and a walkthrough of NTK-aware scaling.