GPT-4.5 "hallucinates" more than a third of the time
If a partner or friend started making up facts every time you asked a question, communication would quickly become very difficult. OpenAI seems to see this behavior differently. In its own release, citing SimpleQA, its in-house factual accuracy benchmark, the company admitted that its new GPT-4.5 large language model "hallucinates" (that is, confidently presents fabrications as facts) 37% of the time.
Futurism reported on the figures.
ChatGPT passes off fabrications as facts in more than one of every three answers
The latest AI model from a company valued at hundreds of billions of dollars gives a false answer more than once in every three attempts on that benchmark.
Strangely enough, OpenAI is trying to present this fabrication problem in a positive light. Its argument is that GPT-4.5 hallucinates less often than the company's earlier models.
The chart in the release shows that GPT-4o, a model billed as having "advanced" reasoning capabilities, "hallucinates" 61.8% of the time on the same SimpleQA benchmark. The smaller and cheaper o3-mini reasoning model produces fabrications in 80.3% of its answers.
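To put the percentages quoted above on a common footing, here is a minimal Python sketch of the arithmetic. It is an illustration only: the helper names `one_in_n` and `relative_reduction` are hypothetical, and the inputs are simply the SimpleQA hallucination rates reported in this article, not a reproduction of OpenAI's evaluation.

```python
def one_in_n(rate: float) -> float:
    """Express a hallucination rate as 'one answer in N' and return N."""
    if not 0 < rate <= 1:
        raise ValueError("rate must be a fraction in (0, 1]")
    return 1 / rate


def relative_reduction(new_rate: float, old_rate: float) -> float:
    """Return the fraction by which new_rate is lower than old_rate."""
    if old_rate <= 0:
        raise ValueError("old_rate must be positive")
    return (old_rate - new_rate) / old_rate


if __name__ == "__main__":
    # Rates as quoted in the article (assumed to be measured the same way).
    gpt_4_5 = 0.37   # GPT-4.5
    older = 0.618    # the 61.8% figure
    o3_mini = 0.803  # the 80.3% figure

    print(f"GPT-4.5: about one answer in {one_in_n(gpt_4_5):.1f}")
    print(f"Relative drop vs. 61.8%: {relative_reduction(gpt_4_5, older):.1%}")
    print(f"Relative drop vs. 80.3%: {relative_reduction(gpt_4_5, o3_mini):.1%}")
```

On these figures, 37% works out to roughly one fabricated answer in every 2.7, which is why the rate is described as more than one in three.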
However, such errors are not unique to OpenAI’s technology.
"At present, even the best models can generate hallucination-free text only about 35 percent of the time," explained Wenting Zhao, a PhD student at Cornell University who last year co-authored a study on "hallucination" rates in AI.
In an interview with TechCrunch, she noted that we cannot yet fully trust what these models generate.
Even setting aside the enormous sums invested in products that struggle with accuracy, this says a lot about the state of the entire AI industry. Technologies that consume vast resources are marketed as a step toward "human-like intelligence", yet they cannot reliably answer even basic questions.
It seems that OpenAI’s language models are gradually losing momentum, and the company is looking hard for a new way to sustain the excitement that followed the launch of ChatGPT.
As a reminder, OpenAI has introduced its new GPT-4.5 artificial intelligence model. Compared with previous versions, it was trained on a larger amount of data.
We also wrote that OpenAI updated the ChatGPT app for iOS. The chatbot can now be used as the default search engine on Apple devices by enabling a dedicated extension in the Safari browser.