Prompt Repetition Improves Non-Reasoning LLMs
Super simple premise: if you repeat your prompt, the performance of non-reasoning LLMs can improve dramatically.
It's useful to know, but worth keeping in mind that model developers may integrate this behavior on their back end to maximize benchmark scores. As a result, the half-life of such prompt-engineering tricks is usually limited: their benefit tends to fade, or even turn negative, over time.
Thanks to prompt caching, the additional input tokens tend to be extremely cheap.
When not using reasoning, repeating the input prompt improves performance for popular models (Gemini, GPT, Claude, and Deepseek) without increasing the number of generated tokens or latency.
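The trick is simple enough to sketch in a few lines. Below is a minimal illustration, assuming the repetition is plain concatenation of the prompt with itself before sending it as a single user message; the helper name `repeat_prompt`, the separator, and the message format are my assumptions, not the paper's exact setup.

```python
def repeat_prompt(prompt: str, n: int = 2, separator: str = "\n\n") -> str:
    """Return the prompt repeated n times, joined by separator.

    Assumption: plain concatenation. The paper's exact framing
    (e.g. an explicit "Again:" prefix) may differ.
    """
    return separator.join([prompt] * n)


# The doubled prompt is sent as one user message to any chat-style API.
doubled = repeat_prompt("What is the capital of France?")
messages = [{"role": "user", "content": doubled}]
```

Because the first copy of the prompt is a prefix the provider has likely cached, the repeated tokens add little cost, which is what makes the trick cheap in practice.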