AI Hallucinations: We Trust Too Much and Never Verify

Introduction

60% of the sources in an Ernst & Young report turned out to be AI hallucinations. How often do we actually check the answers we get from AI?

Yesterday I was at a training session with some colleagues. The topic was artificial intelligence. Someone asked the speaker a question, and in his answer he touched on hallucinations: those moments when an AI model returns fabricated information with the exact same confidence it would use for a verified fact.

When he finished, I brought up a recent example. Ernst & Young.

The Ernst & Young case: AI hallucinations in an official report

In late 2025, EY Canada published a 44-page cybersecurity report on loyalty program fraud, titled Points of Attack: Uncovering Cyber Threats and Fraud in Loyalty Systems. Signed by two partners and a senior manager. On paper, a heavyweight piece of work.

In practice, a disaster.

GPTZero, a startup that detects AI-generated content, analyzed every single citation in the report. The result: 60% of the sources were fabricated. Links pointing to pages that didn't exist, articles attributed to publications that never ran them, McKinsey reports that were never written.

GPTZero coined a term for this: "vibe citing," that is, citing "by feel" and letting the AI invent references that sound plausible.

One of the report's central claims, the estimate that between 30% and 50% of loyalty points are never redeemed, was attributed to a supposed Loyalty Economics Report by McKinsey from 2022. That report doesn't exist. GPTZero traced the number back to a low-quality blog that had likely generated it with AI in the first place. A hallucination citing another hallucination: the researchers called them secondhand hallucinations.

EY retracted the document and launched an internal review. But by then, the report had already been picked up by dozens of articles, blogs, and, closing the loop, AI-powered search engines that were citing it as an authoritative source.

Chris Stokel-Walker wrote an excellent piece on the case for Sherwood News.

It's not just EY: this is everyone's problem

After I shared the story, another colleague shifted the conversation. The point wasn't just EY. The point was: how many of us actually check what AI gives us back?

The discussion got heated. The underlying question was as simple as it was uncomfortable. We all do it. We ask, we read, we use. Without verifying.

Someone admitted they use ChatGPT to draft client emails without re-reading the output. Another discovered, purely by accident, that an internal document contained a nonexistent regulatory reference, AI-generated and copy-pasted as-is. Nobody had noticed for weeks. What struck me most was that no one in the room felt exempt. This wasn't someone else's problem.

The EY case isn't isolated either. AI hallucinations have hit other major organizations. Between 2025 and 2026, Deloitte had to revise a report produced for a Canadian provincial government after fabricated academic citations were discovered, and the law firm Sullivan & Cromwell submitted court filings with over 40 AI-generated errors. The pattern is always the same: AI-generated content, published under the brand of a prestigious organization, with no one having checked whether any of it was true.

[Figure: AI hallucination cases in major organizations]

But why does this happen? Why do competent professionals, in organizations that pour billions into AI, not click a link to see if it exists?

Why the brain doesn't catch AI hallucinations

It sounds like a simple question. It's not.

Cognitive psychology has a well-documented name for this: automation bias, the tendency to trust answers from an automated system even when your own expertise should tell you otherwise. A 2025 study published in Procedia Computer Science (Wingerter et al.) measured it directly: participants who received support from an AI with deliberately wrong answers performed worse than those who had no support at all. In other words, bad AI help is worse than no help. And the participants' self-reported knowledge of AI made no difference: knowing that AI can be wrong doesn't stop you from trusting it anyway.

Sandra Wachter, professor at the Oxford Internet Institute, captured the mechanism well in an interview with MIT Technology Review: AI models aren't built for truth, they're built to be persuasive. The text they produce is so fluent and well-structured that the brain treats it as reliable and skips the verification step. And that fluency is exactly what makes AI hallucinations so dangerous: they don't look like errors, they look like facts.

"I think we might just have to say goodbye to finding out about the truth in a quick way. If you want to find out what is true, it probably takes you more time now." — Sandra Wachter, Oxford Internet Institute

Verifying AI answers: the time nobody invests

When you think about it, the mechanism is pretty clear. AI was adopted en masse because it saves time. But the time saved in producing content rarely gets reinvested in verification. If anything, the implicit promise is to shorten the entire cycle, from question to finished document.

Generating is fast, smooth, satisfying. Verifying is slow, tedious, thankless. It often means discovering that something doesn't add up and having to redo part of the work. It's not hard to see which activity the brain prefers.

There's a practical take on this: the problem is in the process, and the fix is better verification systems, more structured workflows, fact-checking tools, mandatory human reviews. Then there's a more uncomfortable take: by constantly delegating content production and evaluation to AI, we're losing the ability to think critically about what we read in the first place.

A 2025 study by Gerlich, conducted with 666 participants across different age groups, found exactly this: a significant correlation between frequent use of AI tools and reduced critical thinking ability. The mechanism is cognitive offloading: the more you delegate reasoning to the machine, the less your brain trains itself to do it on its own.

Both are probably right, to different degrees. A professional who uses AI to produce a draft and then reviews it point by point is using the tool productively. One who copies a block of citations into a million-dollar report without clicking a single link is doing something else entirely. But in both cases, the responsibility for the final check is always human.

Taking a step back

This article isn't trying to offer solutions, and it doesn't presume to tell anyone how to work. But that discussion with colleagues yesterday got me thinking, and for me that is where the value lies.

Next time an AI gives you a perfectly structured answer, complete with data and sources and links, try clicking one of those links. Try looking up that number. Try checking whether that source actually exists.

It might all check out. Or you might discover you were about to build something on foundations that don't exist. And when in doubt, verifying costs a lot less than finding out later.
