Georg's Blog

Technology, leadership, and the digital frontier

Georg Zoeller
on Arxiv

Prompt Repetition Improves Non-Reasoning LLMs

Super simple premise: if you repeat your prompt, a non-reasoning LLM's performance can improve dramatically.

It's useful to know, but also worth keeping in mind that model developers may integrate this behavior into their back ends to squeeze extra points out of benchmarks. As a result, the half-life of such prompt-engineering tricks is usually fairly limited, and their benefit tends to expire, or even turn negative, after a while.

Due to prompt caching, the additional input tokens tend to be extremely cheap.
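The trick itself is trivial to apply. A minimal sketch, assuming a simple duplication of the full prompt text (the exact repetition format and separator are not specified here, so both are illustrative choices):

```python
def repeat_prompt(prompt: str, n: int = 2, separator: str = "\n\n") -> str:
    """Return the prompt repeated n times, joined by a separator.

    Illustrative helper: the paper's exact repetition scheme may differ.
    """
    return separator.join([prompt] * n)


# The repeated string is then sent as the user message in place of
# the original prompt.
doubled = repeat_prompt("What is the capital of France?")
```

Because the repeated text is identical to the original, providers that cache prompt prefixes can serve much of it at the discounted cached-token rate.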

Prompt Repetition Improves Non-Reasoning LLMs (arxiv.org)

"When not using reasoning, repeating the input prompt improves performance for popular models (Gemini, GPT, Claude, and DeepSeek) without increasing the number of generated tokens or latency."