What is RoPE?

Rotary Position Embedding (RoPE) rotates vectors in 2D subspaces by an angle proportional to their position. The key property: the relative position between two tokens corresponds to the difference between the rotations applied to their embeddings. This enables zero-shot length extrapolation and is used in Llama, PaLM, and GPT-NeoX.

Fig. 1 demo readout (Rotation Angles): Position 1 (Query) = 0.50 rad, Position 2 (Key) = 1.00 rad, relative rotation = 0.50 rad, dot product = 0.877.
Fig. 1 | RoPE rotation in a 2D subspace. The blue vector (Position 1/Query) and the orange vector (Position 2/Key) are rotated by different angles; the angle difference encodes their relative position. The dot product is invariant to a common rotation of both vectors and depends only on their relative angle.
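The figure's numbers can be reproduced with a few lines of Python. This is a minimal sketch; the helper `rot2d` is not part of any library, just the standard 2D rotation applied to a unit vector.

```python
import math

def rot2d(x, angle):
    """Rotate a 2D vector by the given angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1])

# A unit vector rotated to the figure's two positions.
v = (1.0, 0.0)
q = rot2d(v, 0.50)   # Position 1 (Query)
k = rot2d(v, 1.00)   # Position 2 (Key)

dot = q[0] * k[0] + q[1] * k[1]
print(f"{dot:.4f}")  # ≈ cos(1.00 - 0.50) = cos(0.50), the figure's 0.877
```

For unit vectors the dot product is exactly the cosine of the relative angle, which is why the readout shows cos(0.50) ≈ 0.877.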

RoPE Formula

RoPE(x, m) = [
cos(m·θ) · x₀ - sin(m·θ) · x₁,
sin(m·θ) · x₀ + cos(m·θ) · x₁
]

m: Position of the token
θ: Rotation frequency (e.g., θᵢ = 1/10000^(2i/d) for the i-th 2D pair)
x₀, x₁: 2D subspace of the vector
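The formula above can be applied pairwise across a full d-dimensional vector. This is a minimal sketch (not the optimized implementation used in production models, which vectorizes the same rotation), using θᵢ = 1/10000^(2i/d) as given above:

```python
import math

def rope(x, m, base=10000.0):
    """Apply RoPE to vector x at position m.
    Each consecutive pair (x[2i], x[2i+1]) is rotated by m * theta_i,
    with theta_i = base ** (-2 * i / d), matching the formula above."""
    d = len(x)
    out = []
    for i in range(d // 2):
        theta = base ** (-2 * i / d)
        angle = m * theta
        c, s = math.cos(angle), math.sin(angle)
        x0, x1 = x[2 * i], x[2 * i + 1]
        out.extend([c * x0 - s * x1,   # cos(m·θ)·x₀ - sin(m·θ)·x₁
                    s * x0 + c * x1])  # sin(m·θ)·x₀ + cos(m·θ)·x₁
    return out

print(rope([1.0, 0.0, 1.0, 0.0], m=2))
```

Note that each pair gets its own frequency: low-index pairs rotate quickly (fine-grained position), high-index pairs rotate slowly (coarse position), analogous to sinusoidal position encodings.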

Why Does RoPE Work?

The dot product between Query at position m and Key at position n:

q_m · k_n = |q| · |k| · cos(α + (m−n)·θ)

where α is the angle between the unrotated q and k.

This depends only on the relative position (m−n), not on the absolute positions. That is what enables length extrapolation: a model trained on 2K-token contexts often still works at 8K+ tokens.
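The shift invariance is easy to check numerically: rotating both query and key by positions that differ by the same offset leaves the dot product unchanged. A minimal sketch (the helper `rot2d` is just the standard 2D rotation, not a library function):

```python
import math

def rot2d(x, angle):
    """Rotate a 2D vector by the given angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

theta = 0.1
q, k = (0.3, 0.9), (-0.5, 0.4)

# Same relative offset m - n = 4 at different absolute positions:
d1 = dot(rot2d(q, 7 * theta), rot2d(k, 3 * theta))      # positions 7 and 3
d2 = dot(rot2d(q, 104 * theta), rot2d(k, 100 * theta))  # positions 104 and 100
print(abs(d1 - d2) < 1e-12)  # True: only m - n matters
```

Shifting both positions by the same amount composes to the identity rotation inside the dot product, so the score is a function of (m−n) alone.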