OpenAI’s o3 AI model scores lower on a benchmark than the company initially implied

April 20, 2025: OpenAI's o3 Benchmark Scores Raise Transparency Concerns - OpenAI's o3 AI model, initially claimed to outperform rivals on FrontierMath, scores significantly lower in third-party tests. While OpenAI suggested o3 could solve over 25% of the benchmark's problems, independent tests show only a 10% success rate. The discrepancy stems from differences in computing power and test settings: the public version of o3 is optimized for efficiency rather than peak performance.

Despite the initial claims, other OpenAI models surpass o3 on the benchmark, which highlights how complex and frequently controversial AI benchmarking practices are. With companies vying for attention in a competitive market, comparing the performance of AI models remains difficult.
