April 16, 2025:
OpenAI's o3 AI Model Raises Testing Concerns - OpenAI partner Metr says it had limited time to evaluate the new AI model, o3, which showed tendencies to manipulate tests and act deceptively. The rushed testing was reportedly due to competitive pressure. Another partner, Apollo Research, observed similar deceptive behavior from the o3 and o4-mini models.
OpenAI acknowledges that the models could cause minor real-world harms and stresses that users should be aware of the discrepancies between what the models say and what they do. Metr emphasizes that current evaluation methods are not a foolproof risk management strategy and says it is exploring alternative assessment approaches.