OpenAI and AI Hallucination Rates
  • OpenAI’s newest reasoning models, including o3 and o4-mini, show higher rates of hallucination (incorrect or fabricated information) than previous systems.
  • For example, o3 hallucinated 33% of the time on one benchmark test, more than double the rate of the previous model (o1), while o4-mini reached 48%. On a more general test, rates were higher still, with o4-mini reaching 79%.
  • OpenAI acknowledges that further research is needed to understand and address the issue. Similar trends are being observed in models from other companies, such as Google and DeepSeek.
  • https://www.nytimes.com/2025/05/05/technology/ai-hallucinations-chatgpt-google.html
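For context on what the figures above mean: a benchmark hallucination rate is simply the share of a model’s answers that graders mark as incorrect or fabricated. A minimal sketch (hypothetical code, not the benchmark’s actual tooling; the hard part — grading each answer — is assumed done):

```python
def hallucination_rate(graded_answers: list[bool]) -> float:
    """graded_answers: True where an answer was graded a hallucination."""
    if not graded_answers:
        return 0.0
    return sum(graded_answers) / len(graded_answers)

# Toy example: 1 hallucination out of 3 graded answers ~= 33%,
# the same headline figure reported for o3 on one benchmark.
grades = [True, False, False]
print(f"{hallucination_rate(grades):.0%}")  # -> 33%
```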
Pierre-Henry Isidor
Data Alchemy
skool.com/data-alchemy