Why LLMs overlook information in the middle of long contexts – and how to solve the problem
"Lost in the Middle" is the paper that was a wake-up call for RAG developers: even with 32K of context, information in the middle is reliably ignored. This visualization shows the U-curve in action and explains the countermeasures.
Visualization: attention distribution across context positions – why more context isn't automatically better.
The paper "Lost in the Middle" (Liu et al., Stanford, 2023) showed that with 20 documents in context, the middle document is ignored in roughly 50% of cases. This has massive consequences for RAG architectures.
Despite large context windows (32K, 100K+ tokens), LLMs show a surprising behavior: they largely overlook information in the middle and focus on the beginning and the end.
This produces a characteristic U-shaped attention distribution (the U-curve): information at the beginning is processed well, information in the middle is overlooked, and information at the end is attended to again.
In RAG pipelines or long-context QA, critical information can be in the middle of a document – exactly where the model doesn't look.
System prompts at the beginning are processed well. This is one of the reasons why placing instructions at the start of the context is important.
The U-shaped attention arises from two factors:
Architecture: Transformers use causal attention masking, meaning each token can only attend to previous tokens. Early tokens are therefore visible to every later token, which creates a structural bias toward the beginning (see the mask sketch after this section).
Training data has biased patterns: in typical documents, the important information (titles, abstracts, conclusions) clusters at the beginning and the end.
The model implicitly learns that beginning and end are more important. This trained bias manifests as the U-curve.
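To make the architectural factor concrete, here is a minimal NumPy sketch. It is purely illustrative and assumes that "how many tokens can attend to a position" is a fair proxy for that position's structural advantage:

```python
# Minimal sketch: a causal attention mask structurally favors early positions.
import numpy as np

seq_len = 8
# Lower-triangular mask: entry (i, j) == 1 means token i may attend to token j.
mask = np.tril(np.ones((seq_len, seq_len), dtype=int))

# How many tokens can attend to each position: the first token is visible
# to all 8 tokens, the last one only to itself.
visibility = mask.sum(axis=0)
print(visibility)  # [8 7 6 5 4 3 2 1]
```

Note that this only accounts for the bias toward the beginning; the bias toward the end follows from the training-data factor described above.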
In Retrieval-Augmented Generation (RAG) pipelines, the U-curve becomes particularly problematic:
| Scenario | Position in Context | Success Rate | Implication |
|---|---|---|---|
| Document at the beginning | 0% | ~95% | Attended to and processed |
| Document in the middle | 50% | ~50% | Often ignored |
| Document at the end | 100% | ~90% | Attended to (directly before the question) |
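How such numbers are obtained can be sketched as follows: slide the relevant ("gold") document through the positions between distractors and measure accuracy per position. The `ask_llm` stub below is a hypothetical placeholder for an actual model client, not a real API:

```python
# Hedged sketch of a position-sweep experiment in the spirit of "Lost in the Middle".

def ask_llm(prompt: str) -> str:
    # Hypothetical stub – replace with a call to your actual LLM client.
    raise NotImplementedError

def build_prompt(docs: list[str], question: str) -> str:
    context = "\n\n".join(f"Document {i + 1}: {d}" for i, d in enumerate(docs))
    return f"{context}\n\nQuestion: {question}\nAnswer:"

def position_sweep(gold_doc: str, distractors: list[str],
                   question: str, expected: str) -> dict[int, bool]:
    """Check whether the expected answer appears, per position of the gold document."""
    results = {}
    for pos in range(len(distractors) + 1):
        docs = distractors[:pos] + [gold_doc] + distractors[pos:]
        answer = ask_llm(build_prompt(docs, question))
        results[pos] = expected.lower() in answer.lower()
    return results
```

Averaged over many questions, this kind of sweep produces exactly the U-curve shown in the table above.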
Standard retrieval ranks purely by relevance. But given the U-curve, the top-k documents belong at the beginning and end of the context – not in the middle!
- Found-in-the-Middle calibration: rank by relevance AND position, explicitly taking the U-curve into account. The approach: arrange retrieval results so that important documents don't land in the middle (see the reordering sketch after this list).
- Position-aware RAG and prompt design: avoid placing critical information in the middle.
- New training strategies and architectures that can reduce or eliminate the U-curve.
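A minimal sketch of such a reordering (a generic "sandwich" interleaving; the function name is ours, and the input is assumed to be sorted by relevance, most relevant first):

```python
# Interleave relevance-ranked documents so the strongest ones land at the
# beginning and end of the context and the weakest in the middle.

def reorder_for_ucurve(docs_by_relevance: list[str]) -> list[str]:
    front, back = [], []
    for i, doc in enumerate(docs_by_relevance):
        if i % 2 == 0:
            front.append(doc)   # ranks 1, 3, 5, ... go to the front
        else:
            back.append(doc)    # ranks 2, 4, 6, ... go to the back
    return front + back[::-1]

docs = ["doc1", "doc2", "doc3", "doc4", "doc5"]  # sorted by relevance
print(reorder_for_ucurve(docs))  # ['doc1', 'doc3', 'doc5', 'doc4', 'doc2']
```

The least relevant document ends up exactly where the model looks least.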
A practical reason why system prompts are positioned at the beginning: they fall into the high-attention region at the start of the U-curve!
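A hedged sketch of what a U-curve-aware layout can look like with plain string prompts (`assemble_prompt` is a hypothetical helper, not a library function):

```python
# Instructions first, question last: both land in high-attention regions;
# the documents fill the middle (optionally pre-ordered as in the sketch above).

def assemble_prompt(system_prompt: str, docs: list[str], question: str) -> str:
    context = "\n\n".join(f"Document {i + 1}:\n{d}" for i, d in enumerate(docs))
    return f"{system_prompt}\n\n{context}\n\nQuestion: {question}\nAnswer:"

prompt = assemble_prompt(
    system_prompt="You are a helpful assistant. Answer only from the documents.",
    docs=["First source text ...", "Second source text ..."],
    question="What does the first source say?",
)
print(prompt)
```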
The key takeaways:

- LLMs show structurally higher attention at the beginning and end, not in the middle – despite large context windows.
- The combination of causal masking and training-data biases creates the U-curve; it is not easy to fix.
- Long contexts are less useful than they appear: mainly the beginning and end are actively used.
- Standard retrieval ranking ignores position. Found-in-the-Middle calibration gains up to +15% through better positioning.
- System prompt at the top, question at the end is the best layout; keep critical information out of the middle.
- Research explores position shuffling and new architectures, but these are not yet standard in production.