arxiv.org
Pretraining on the Test Set Is All You Need
One of the earlier papers to argue that AI benchmarks primarily measure memorisation and training data expansion, explaining benchmaxxing before the term became popular.