The Fluency Trap Revisited
I’ve been at UNLEASH America in Las Vegas this week, sitting in sessions, talking to senior leaders in the corridors, listening to what people are actually saying about AI right now. Not what they say in press releases. What they say when they’re thinking out loud between panels.
The thing I keep hearing is some version of this: we’re past the early problems. The models are solid now. They reason through things. They catch their own mistakes.
Some months ago I wrote about what I called the fluency trap in a piece for the Learning Guild, alongside a companion article on the persistent error patterns that survive even as model capabilities advance. The argument then was straightforward: our cognitive architecture predisposes us to read articulate, confident linguistic output as a reliable signal of underlying competence, and LLMs exploit that predisposition with remarkable efficacy. I thought naming it might help. What I’m noticing at UNLEASH is that the belief has hardened rather than softened, and the evidence since has not been kind to it.
Cognitive science has a name for that predisposition: the fluency heuristic. We read polish as evidence of the thinking behind it. This isn’t a flaw in human reasoning. It’s a feature that served us well for a very long time. Producing polished, well-structured prose used to be expensive. It required knowledge, time, and craft. The surface quality of writing was a reasonable proxy for the quality of thought behind it, because generating convincing surface quality without the substance was genuinely hard.
AI has changed that relationship. The heuristic hasn’t failed. Its conditions have.
The models now produce fluent, structured, confident prose at effectively zero marginal cost, regardless of whether the underlying output is accurate, complete, or reasoned. The signal remains. The cost of faking it has gone. What that means in practice is that the very quality of AI output, the thing that makes it useful and impressive, is also what makes it harder to scrutinise. The better it looks, the less it gets questioned. And the belief that the model checks itself is what converts that dynamic from a risk into an assumption: if the system flags its own errors, scrutiny isn’t just reduced, it’s replaced.
The research on self-correction tells a different story. A peer-reviewed survey of the self-correction literature, published in the Transactions of the Association for Computational Linguistics in 2024, found no evidence that LLMs can reliably self-correct without external feedback. The correction is generated by the same process that generated the error. The model performs confidence, then performs reflection. Both are predictions. There is no second system doing the checking. What looks like self-correction is, more precisely, a fluent description of self-correction.
Which brings me to a finding that I find genuinely unsettling, not because it is surprising, but because of where it lands.
In February 2026, researchers at the Icahn School of Medicine at Mount Sinai published a large study in The Lancet Digital Health. They tested leading language models against clinical scenarios containing deliberate misinformation: a fabricated recommendation embedded in a hospital discharge summary, a health myth framed in clinical language, a physician’s note with a single false detail. What they found was that the models accepted and propagated the false information between 32 and 46 per cent of the time when it was framed in confident, professional prose. For these systems, the style of writing frequently overrode the content. If it sounded like a doctor wrote it, the model treated it as valid, even when it contradicted established medical knowledge.
The same family of failure runs in both directions. We are susceptible to AI’s polish. AI, it turns out, is susceptible to ours. The mechanism in each case is the same: style functioning as a proxy for substance. That symmetry is worth sitting with, because it tells you something about where the problem actually lives. It isn’t in the technology. It isn’t in the people. It’s in the heuristic both are running, and the heuristic is old, deeply embedded, and for most of human history, correct.
Outside healthcare, the evidence is harder to miss and easier to ignore. A Bloomberg Law analysis published in January 2026 identified more than 550 documented court cases involving AI-hallucinated legal citations. The more revealing observation, though, wasn’t the number. It was the description of what the filings actually looked like: polished surface, real citations, confident language, no analytical reasoning, no lawyer genuinely engaged with the record. One attorney put it with accidental precision: the chatbot sounded confident, and lawyers are used to trusting confident voices. The verification culture in law is about as strong as it gets. It wasn’t enough.
In late 2025, two government reports produced by Deloitte and delivered to two different countries contained fabricated academic citations and an invented quote from a federal judge. Combined value: over a million dollars. Neither report looked wrong. That was, in the end, the mechanism. They looked exactly like every credible report those governments had ever received.
I keep coming back to the conversations I’ve been having this week. Smart people. Experienced leaders. People who’ve read the headlines and still arrive at the belief that the outputs can now be trusted because the models reason, because they reflect, because they check. I don’t think that belief comes from naivety. I think it comes from the same place it always has: the output looks right. It reads as though someone who knew what they were doing produced it. And somewhere in the gap between that appearance and what’s actually behind it, the question that should have been asked doesn’t get asked.
That’s not a technology problem. That’s the oldest pattern in organisational life: we stop verifying what we trust, and we trust what looks the part. AI didn’t create that pattern. It just produces, at scale and at speed, exactly the surface that triggers it.
What I don’t yet know, sitting here at the end of a week of conversations about AI adoption and workforce readiness and governance frameworks, is how you rebuild a verification instinct that’s being systematically outpaced by the thing it’s supposed to verify. The models will keep getting better at producing outputs that look like the outputs that deserve to be trusted. The heuristic will keep firing. The gap between appearance and substance will keep being invisible until it isn’t.
The question the room at UNLEASH isn’t asking isn’t about the technology. It’s about what we’ve quietly handed over in exchange for the fluency, and whether we’ve noticed we’ve done it.