Prompt Repetition Improves Non-Reasoning LLMs
Super simple premise: if you repeat your prompt, the performance of non-reasoning LLMs can improve dramatically.
It's useful to know, but worth keeping in mind that model developers may integrate this behavior on their back end to maximize benchmark scores. As a result, the half-life of such prompt-engineering tricks is usually limited: their benefit tends to fade, or even turn negative, over time.
Thanks to prompt caching, the additional input tokens tend to be extremely cheap.
When not using reasoning, repeating the input prompt improves performance for popular models (Gemini, GPT, Claude, and Deepseek) without increasing the number of generated tokens or latency.
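The trick is simple enough to sketch in a few lines. Below is a minimal illustration, assuming the repetition is plain concatenation of the prompt with itself before sending it as a single user message; the helper name `repeat_prompt`, the separator, and the message format are my assumptions, not the paper's exact setup.

```python
def repeat_prompt(prompt: str, n: int = 2, separator: str = "\n\n") -> str:
    """Return the prompt repeated n times, joined by separator.

    Assumption: plain concatenation. The paper's exact framing
    (e.g. an explicit "Again:" prefix) may differ.
    """
    return separator.join([prompt] * n)


# The doubled prompt is sent as one user message to any chat-style API.
doubled = repeat_prompt("What is the capital of France?")
messages = [{"role": "user", "content": doubled}]
```

Because the first copy of the prompt is a prefix the provider has likely cached, the repeated tokens add little cost, which is what makes the trick cheap in practice.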