Gemini 2.5 achieved a score of 24.4% on the MathArena USAMO (United States of America Mathematical Olympiad) evaluation, surpassing all previously evaluated models. Because the benchmark uses problems released after the model's training cutoff, this score reflects genuine reasoning and generalization on complex mathematical problems rather than memorization.
MathArena is a rigorous platform designed to evaluate large language models (LLMs) on the latest math competitions and olympiads. It ensures fair assessment by testing models only on competitions held after their release, avoiding contamination from problems that may appear in a model's training data. The platform publishes detailed leaderboards and open-sources its evaluation code to keep model comparisons transparent and reproducible.
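The contamination-avoidance rule above can be sketched as a simple date filter. This is a minimal illustration, not MathArena's actual code; the function name, data layout, and release date below are hypothetical.

```python
from datetime import date

def eligible_competitions(model_release, competitions):
    """Keep only competitions held strictly after the model's release,
    so their problems cannot have appeared in its training data."""
    return [c for c in competitions if c["date"] > model_release]

# Hypothetical example data for illustration only.
competitions = [
    {"name": "AIME 2024", "date": date(2024, 2, 1)},
    {"name": "USAMO 2025", "date": date(2025, 3, 19)},
]
release = date(2024, 12, 1)  # placeholder model release date

print([c["name"] for c in eligible_competitions(release, competitions)])
# → ['USAMO 2025']
```

Under this rule, a model released in December 2024 could be scored on USAMO 2025 but not on AIME 2024, since the latter predates its release.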
For more details, visit the MathArena website or the discussion on Hacker News.