April 7, 2025:
Meta Denies Claims of Llama 4 Benchmark Manipulation - Meta's VP of generative AI, Ahmad Al-Dahle, denied claims that the company inflated benchmark scores for its Llama 4 Maverick and Scout models by training them on test sets. The rumor originated in a Chinese social media post by a purported former employee, who alleged that Meta had concealed the models' weaknesses.
Discrepancies in reported model performance and the use of an experimental Maverick version for benchmarks contributed to the speculation. Al-Dahle acknowledged that users were seeing inconsistent results and pledged to resolve these issues as implementations stabilize.