Few-shot examples: choose them like you choose unit tests
- Sam Wilson
- Prompting, Best Practices
- 02 May, 2026
A prompt with five well-chosen examples beats the same model with fifty mediocre ones, almost every time. The mistake most teams make is treating examples like decoration — a few obvious cases pasted at the top of the prompt. Examples are the closest thing you have to test cases, and they should be curated with the same care.
What a good example actually does
It pins down a decision the model would otherwise hedge on. If your task involves rare-but-important categories, every example you skip is a category the model will silently merge into something more common. If the task is style-sensitive, every example sets the rhythm of the answer. The model is pattern-matching on your examples; if they don't capture the patterns you care about, it will faithfully reproduce whatever patterns they do capture instead.
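A minimal sketch of the idea, assuming a hypothetical support-ticket classification task: the categories, messages, and labels below are invented for illustration. The point is the third example — a rare category the model would otherwise fold into a more common one if no example pinned it down.

```python
# Hypothetical few-shot setup: one example per decision we want pinned down.
# The "fraud" example exists because fraud is rare-but-important — without it,
# a model tends to merge those messages into a more common category.
EXAMPLES = [
    ("The app crashes when I open settings.", "bug"),
    ("Please add a dark mode.", "feature_request"),
    ("My card was charged twice by a merchant I don't recognize.", "fraud"),
]

def build_prompt(query: str) -> str:
    """Assemble a classification prompt from the curated examples."""
    lines = ["Classify each message as bug, feature_request, or fraud.", ""]
    for text, label in EXAMPLES:
        lines.append(f"Message: {text}")
        lines.append(f"Label: {label}")
        lines.append("")
    lines.append(f"Message: {query}")
    lines.append("Label:")  # model completes from here
    return "\n".join(lines)
```

Each example here earns its place by resolving one specific ambiguity, which is the property worth preserving as the pool grows.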
Choosing them like unit tests
Cover edge cases first, not happy paths. The cases your examples don't cover are the ones the model will guess on. Track example performance on a holdout set — when an example stops pulling its weight in tokens, replace it. Treat the example pool as a living dataset, not a static prompt fragment.
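The holdout-tracking step above can be sketched with leave-one-out scoring. This is a hedged illustration, not a prescribed method: `predict` stands in for whatever calls your model with a given example set, and the contribution scores only say how much holdout accuracy drops when an example is removed.

```python
def evaluate(examples, holdout, predict):
    """Accuracy of predict(examples, input) over a labeled holdout set."""
    correct = sum(predict(examples, x) == y for x, y in holdout)
    return correct / len(holdout)

def leave_one_out(examples, holdout, predict):
    """Score each example by the accuracy drop when it is removed.

    Examples with near-zero or negative contribution are the ones that
    have stopped pulling their weight and are candidates for replacement.
    """
    base = evaluate(examples, holdout, predict)
    contributions = {}
    for i, ex in enumerate(examples):
        rest = examples[:i] + examples[i + 1:]
        contributions[ex] = base - evaluate(rest, holdout, predict)
    return contributions
```

Re-running this whenever the holdout set grows is what turns the example pool into a living dataset rather than a frozen prompt fragment.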
The teams that reliably ship few-shot prompts have an example-curation pipeline. The teams that struggle treat examples as folklore.