Sentiment Analysis Example

Input: "The product is great!" Label: Positive
Input: "Terrible, totally disappointed." Label: Negative
Input: "It's okay, nothing special." Label: Neutral
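
As a minimal sketch, the three labelled examples above could be assembled into a few-shot prompt as follows. The helper name `build_prompt`, the query sentence, and the exact line layout are illustrative assumptions. No model is called here; the resulting string would be sent to whatever model is in use.

```python
# Hypothetical helper: formats labelled demonstrations plus a new query
# into a single few-shot prompt string.
DEMONSTRATIONS = [
    ("The product is great!", "Positive"),
    ("Terrible, totally disappointed.", "Negative"),
    ("It's okay, nothing special.", "Neutral"),
]


def build_prompt(demos, query):
    # One line per demonstration, all in the same "Input ... Label ..." format
    lines = [f'Input: "{text}" Label: {label}' for text, label in demos]
    # The final line repeats the format but leaves the label open
    lines.append(f'Input: "{query}" Label:')
    return "\n".join(lines)


prompt = build_prompt(DEMONSTRATIONS, "Fast delivery, would buy again.")
print(prompt)
```

The prompt ends with an open "Label:" so that the model's most natural continuation is one of the labels seen in the demonstrations.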

How ICL Works

1
Pattern Recognition: The model recognizes the format: "Input → Label". It looks for recurring patterns in the sequence and applies them to new inputs.
2
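Induction Heads Circuit: Research suggests that specialized attention heads, called induction heads, implement this mechanism. Given the current token, they attend to an earlier occurrence of that token and copy the token that followed it, so a sequence [A][B] ... [A] is completed with [B].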
3
Non-parametric Learning: Unlike traditional machine learning, the model is not retrained and its weights do not change. Instead, the examples placed in the context window (up to 128K tokens in some models) "program" the model for the new task.
4
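Min et al. Discovery (2022): That demonstrations are shown matters more than whether their labels are correct. In their experiments, replacing the demonstration labels with random ones reduced performance only slightly, while the format, the label space, and the distribution of the input text played a larger role. The model learns mainly from the format.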
5
Best Practices: Use XML or Markdown tags to give the prompt structure (wrapping an example as <text>Example</text> tends to help more than plain text). Choose examples that are relevant to the task. Beyond roughly 5-10 examples, adding more usually brings no further improvement.
6
Practical Limits: Large models (100B+ parameters) show strong ICL, while small models (7B-13B parameters) tend to show weaker ICL. This is often described as a form of "emergence": the ability only appears above a certain model size.
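
To make the induction-head mechanism from step 2 more concrete, here is a toy sketch of the copy rule it approximates: find an earlier occurrence of the current token and return whatever followed it. The function name `induction_copy` is hypothetical, and the hard lookup is an assumption made for illustration; real induction heads perform this softly through attention weights rather than exact matching.

```python
def induction_copy(tokens):
    """Toy induction rule: find the most recent earlier occurrence of the
    last token and return the token that followed it, or None if the last
    token has not appeared before."""
    if not tokens:
        return None
    current = tokens[-1]
    # Scan earlier positions from right to left, excluding the last token
    for i in range(len(tokens) - 2, -1, -1):
        if tokens[i] == current:
            return tokens[i + 1]
    return None


context = ["A", "B", "C", "A"]
print(induction_copy(context))  # B
```

In a few-shot prompt the same rule explains why a repeated "Label:" marker pulls the model toward producing a label next: the tokens that followed earlier occurrences of the marker were labels.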