April 11, 2025:
Meta's Maverick AI Underperforms on Benchmark - Meta's unmodified Llama-4-Maverick model ranks below competitors on the LM Arena benchmark, following a controversy over Meta's use of an optimized, experimental variant to achieve high scores. The vanilla release, Llama-4-Maverick-17B-128E-Instruct, trails models such as OpenAI's GPT-4o and Anthropic's Claude 3.5 Sonnet.
Meta acknowledges that it submitted a chat-optimized variant to the benchmark, but emphasizes its commitment to open-source collaboration as the path to improving future models.