Interactive Tutorial

How modern Large Language Models work

From input to response, explore the architecture behind GPT-4, Claude, and Llama. Click on a building block to find out more.

📝
Input
Text / Prompt
Tokenizer
BPE / ~100k tokens
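The tokenizer splits text into subword units via byte pair encoding (BPE). A minimal sketch using the tiktoken library's cl100k_base encoding (a ~100k-token vocabulary, used by GPT-4); Llama 3 ships its own BPE vocabulary of roughly 128k tokens, but the principle is the same:

```python
# Sketch: BPE tokenization with tiktoken's cl100k_base encoding
# (~100k-token vocabulary, used by GPT-4).
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
ids = enc.encode("How do large language models work?")
print(ids)              # a short list of integer token ids
print(enc.decode(ids))  # decoding round-trips to the original string
```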
🎯
Embeddings
d_model = 8,192
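Each token id then indexes a row of a learned embedding matrix, yielding one dense vector per token. A sketch in PyTorch with small demo dimensions; in Llama 3 70B the table is roughly 128k vocabulary entries × d_model = 8,192:

```python
# Sketch: token ids index rows of a learned embedding matrix.
# Demo sizes below; Llama 3 70B uses ~128k vocab x d_model = 8192.
import torch
import torch.nn as nn

vocab_size, d_model = 1000, 64            # demo sizes
embed = nn.Embedding(vocab_size, d_model)

token_ids = torch.tensor([[3, 17, 256]])  # placeholder ids, shape (batch=1, seq_len=3)
x = embed(token_ids)                      # shape (1, 3, 64): one vector per token
print(x.shape)
```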
🔮
Transformer Stack
×80 blocks (Llama 3 70B)
Transformer Block Structure
RMSNorm → Self-Attention → Residual
RMSNorm → SwiGLU Feedforward → Residual
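A sketch of one such Llama-style block in PyTorch, under simplifying assumptions: rotary position embeddings, grouped-query attention, causal masking, and KV caching are omitted, and the dimensions are demo-sized rather than Llama 3 70B's:

```python
# Sketch of one Llama-style transformer block: pre-RMSNorm, self-attention,
# residual add, then pre-RMSNorm, SwiGLU feedforward, residual add.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RMSNorm(nn.Module):
    def __init__(self, d, eps=1e-5):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(d))
        self.eps = eps
    def forward(self, x):
        # Normalize by the root-mean-square of each vector, then rescale.
        rms = torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)
        return self.weight * (x * rms)

class SwiGLU(nn.Module):
    def __init__(self, d, hidden):
        super().__init__()
        self.gate = nn.Linear(d, hidden, bias=False)
        self.up   = nn.Linear(d, hidden, bias=False)
        self.down = nn.Linear(hidden, d, bias=False)
    def forward(self, x):
        # Gated feedforward: SiLU(gate(x)) elementwise-multiplied with up(x).
        return self.down(F.silu(self.gate(x)) * self.up(x))

class Block(nn.Module):
    def __init__(self, d, n_heads, ffn_hidden):
        super().__init__()
        self.norm1 = RMSNorm(d)
        self.attn  = nn.MultiheadAttention(d, n_heads, batch_first=True)
        self.norm2 = RMSNorm(d)
        self.ffn   = SwiGLU(d, ffn_hidden)
    def forward(self, x):
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        x = x + attn_out                  # residual connection 1
        x = x + self.ffn(self.norm2(x))   # residual connection 2
        return x

# The model is simply a stack of these blocks (Llama 3 70B: 80 of them).
blocks = nn.Sequential(*[Block(512, 8, 1376) for _ in range(4)])  # demo sizes
out = blocks(torch.randn(1, 16, 512))   # (batch, seq_len, d_model)
print(out.shape)
```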
📊
Output Layer
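The output layer (LM head) projects the final hidden state back to vocabulary size, producing one logit per token. A sketch with demo dimensions; Llama 3 70B projects d_model = 8,192 down to ~128k logits:

```python
# Sketch: the LM head maps the last hidden state to one logit per vocab entry.
import torch
import torch.nn as nn

vocab_size, d_model = 1000, 64      # demo sizes
lm_head = nn.Linear(d_model, vocab_size, bias=False)

h_last = torch.randn(1, d_model)    # hidden state at the last position
logits = lm_head(h_last)            # shape (1, vocab_size)
print(logits.shape)
```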
🎲
Sampling
Temperature / Top-p
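The logits are turned into a probability distribution and the next token is drawn from it. A sketch of temperature scaling plus top-p (nucleus) sampling; the function name and default values are illustrative, not from the tutorial:

```python
# Sketch: temperature scaling followed by top-p (nucleus) sampling.
import torch

def sample(logits, temperature=0.7, top_p=0.9):
    # Lower temperature sharpens the distribution, higher flattens it.
    probs = torch.softmax(logits / temperature, dim=-1)
    sorted_probs, sorted_idx = torch.sort(probs, descending=True)
    cumulative = torch.cumsum(sorted_probs, dim=-1)
    # Drop every token whose predecessors already cover top_p of the mass.
    sorted_probs[cumulative - sorted_probs >= top_p] = 0.0
    sorted_probs /= sorted_probs.sum()     # renormalize the remaining nucleus
    choice = torch.multinomial(sorted_probs, num_samples=1)
    return sorted_idx[choice]              # map back to the original token id

logits = torch.randn(128_256)   # stand-in for real LM-head output (Llama 3 vocab size)
print(sample(logits).item())
```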
📝 Input
Chapter 1

What's happening here?

The user enters a text prompt. This can be a question, an instruction, or context.
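Chat models do not see the bare prompt: it is first wrapped in a chat template with special tokens. A sketch using Llama 3's published instruct template; GPT-4 and Claude use different formats:

```python
# Sketch: wrapping a user prompt in Llama 3's instruct chat template
# before tokenization. Special tokens follow Meta's published format.
prompt = "Why is the sky blue?"   # a question; could equally be an instruction or context

formatted = (
    "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
    f"{prompt}<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
)
print(formatted)   # this full string, not the bare prompt, goes to the tokenizer
```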


Technical details