In-Context Learning: How a few examples steer model behavior
Few-shot prompting is the practical application of ICL: it complements the theoretical ICL demo with experimental insights on example selection and example count.
The GPT-3 paper was titled "Language Models are Few-Shot Learners" – few-shot learning was its core discovery. Knowing how many examples are optimal saves tokens and improves results.
Few-Shot Learning means the model learns to recognize a pattern through a few input-output examples in the prompt and applies it to new inputs – without updating any parameters.
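This can be sketched as a prompt that concatenates a few input-output demonstrations and then presents a new input; the model completes the pattern without any weight update. The sentiment task, the example data, and the `build_prompt` helper below are illustrative assumptions, not from any specific library.

```python
# Hypothetical few-shot demonstrations for a sentiment task.
EXAMPLES = [
    ("The battery died after one day.", "negative"),
    ("Absolutely love the new camera!", "positive"),
    ("Shipping was fast and painless.", "positive"),
]

def build_prompt(examples, new_input):
    """Concatenate input-output demonstrations, then append the new input.

    The model is expected to continue the prompt with the missing label,
    learning the pattern purely from context."""
    lines = []
    for text, label in examples:
        lines.append(f"Review: {text}\nSentiment: {label}\n")
    # The new input ends with the output cue, so the model fills in the label.
    lines.append(f"Review: {new_input}\nSentiment:")
    return "\n".join(lines)

prompt = build_prompt(EXAMPLES, "The screen cracked immediately.")
print(prompt)
```

The key point: nothing about the model changes; the pattern lives entirely in the prompt.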
Attention recognizes the structure of examples and applies it to new inputs.
The structure of examples is more important than content correctness (Min et al., 2022).
Performance rises quickly with 1-5 examples, then plateaus.
Comparison: Same format with correct labels vs. random labels
Observation: Accuracy rises steeply from 0-Shot → 1-Shot → 5-Shot, then plateaus. After ~8-10 examples, each additional example brings little improvement (Diminishing Returns).
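The shot-count sweep behind this observation can be sketched as a small loop: build a prompt with k demonstrations, query the model, and measure accuracy per k. Here `query_model` is a stub standing in for a real LLM API call, and the pool, test set, and helper names are assumptions for illustration.

```python
# Stub for a real LLM call (e.g. an API client); always answers "positive"
# so the sweep runs end to end without network access.
def query_model(prompt: str) -> str:
    return "positive"

def build_prompt(examples, new_input):
    """Format k demonstrations plus the new input in a consistent layout."""
    demos = "\n\n".join(f"Input: {x}\nLabel: {y}" for x, y in examples)
    tail = f"Input: {new_input}\nLabel:"
    return f"{demos}\n\n{tail}" if demos else tail

def accuracy_at_k(pool, test_set, k):
    """Accuracy when the first k pool examples are used as demonstrations."""
    correct = 0
    for text, gold in test_set:
        pred = query_model(build_prompt(pool[:k], text)).strip()
        correct += (pred == gold)
    return correct / len(test_set)

# Hypothetical data: demonstration pool and labeled test set.
pool = [
    ("Great phone, very happy.", "positive"),
    ("Broke after a week.", "negative"),
    ("Does exactly what it promises.", "positive"),
]
test_set = [
    ("Works flawlessly.", "positive"),
    ("Terrible support experience.", "negative"),
]

for k in [0, 1, 3]:
    print(k, accuracy_at_k(pool, test_set, k))
```

With a real model, plotting accuracy over k typically shows the steep 0→1→5 rise and the plateau described above.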
All examples must have the same format (XML tags, JSON, Markdown).
Examples should cover the variety of expected inputs.
Use XML/JSON for clear demarcation of input and output.
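A minimal sketch of XML demarcation, assuming illustrative tag names (`example`, `input`, `output`) – any consistent tags work, as long as every demonstration uses the same ones:

```python
def format_example(text: str, label: str) -> str:
    """Wrap one demonstration in XML tags so input and output are unambiguous."""
    return (
        "<example>\n"
        f"  <input>{text}</input>\n"
        f"  <output>{label}</output>\n"
        "</example>"
    )

# Hypothetical demonstrations in a uniform format.
examples = [
    ("Refund took three weeks.", "negative"),
    ("Setup was effortless.", "positive"),
]

prompt = "\n".join(format_example(t, l) for t, l in examples)
# The final example is left open at <output> so the model completes the label.
prompt += "\n<example>\n  <input>Arrived damaged.</input>\n  <output>"
print(prompt)
```

The open `<output>` tag at the end doubles as the completion cue and tells the model exactly where its answer belongs.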
Start with 1-3 examples; test up to a maximum of 10.
Although format matters more, labels should still be correct.
Place high-quality examples preferably at the beginning.