Why do AIs sometimes make up information (hallucinations)?

The analogy

Think about recounting a dream right after waking up. Without realizing it, you fill the gaps in your memory with details that fit the story, and you tell them as naturally as the parts you actually remember. You're not lying: your brain can't stand gaps in the narrative and plugs them with something coherent.

The AI does the same thing, all the time. Its job is to complete, not to check. Faced with a question whose answer it "doesn't know", it doesn't leave the gap empty: it puts down the most probable word, then the next, until the sentence sounds right. The result is fluent and confident, and that is precisely why it deceives: how confident an answer sounds has nothing to do with how true it is.

How it really works

The AI doesn't consult an archive of facts when it answers you. It generates text by predicting, word after word, what is statistically likely to come next, based on the enormous amounts of text it was trained on. There's no moment when it checks "is this true?": it has no concept of true and false, only of probable.

For a common fact repeated everywhere (the capital of France), the most probable word almost always coincides with the right one. For a rare, recent or very specific fact (the exact date of a niche event, the precise title of a study, an article of law), the web of probability is sparse: there the AI fills the gap with something plausible, and plausible doesn't mean true. Invented quotations are the textbook case: the form of a quotation is very easy to imitate, the exact content is not.
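The difference between a common fact and a rare one can be sketched with a toy word predictor. This is an illustration only, not how real models are built: real systems use neural networks trained on vastly more text. The corpus, the names `train` and `predict_next`, and the made-up country "atlantis" are all assumptions introduced for the example. What the sketch shows is that the predictor has no "I don't know" path: when it has never seen the topic, it falls back on a shorter context and still answers.

```python
from collections import Counter, defaultdict

# Hypothetical training text: one fact repeated often, another seen once.
CORPUS = [
    "the capital of france is paris",
    "the capital of france is paris",
    "the capital of france is paris",
    "the capital of italy is rome",
]

MAX_CONTEXT = 3  # how many preceding words the toy model may look at


def train(sentences):
    """Count which word follows each context of 1 to MAX_CONTEXT words."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for i in range(1, len(words)):
            for n in range(1, MAX_CONTEXT + 1):
                if i - n < 0:
                    break
                counts[tuple(words[i - n:i])][words[i]] += 1
    return counts


def predict_next(counts, prompt):
    """Return the most probable next word and its share of the counts.

    The longest known context wins; if it was never seen, the function
    backs off to a shorter one. It never checks whether the word is true.
    """
    words = prompt.split()
    for n in range(MAX_CONTEXT, 0, -1):
        context = tuple(words[-n:])
        if len(context) == n and context in counts:
            word, count = counts[context].most_common(1)[0]
            return word, count / sum(counts[context].values())
    return None, 0.0


counts = train(CORPUS)
print(predict_next(counts, "the capital of france is"))    # ('paris', 1.0)
print(predict_next(counts, "the capital of atlantis is"))  # ('paris', 0.75)
```

The second answer is invented: "atlantis" never appears in the training text, so the toy model completes the sentence with the word that most often follows "is", and it does so with a high score.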

What you can do in practice

You can't switch off hallucinations, but you can make them harmless:

Ask for the sources and actually open them. A link that won't open or doesn't say what the AI claims is a warning sign.
Treat every specific detail as "to be verified": numbers, dates, proper names, quotations, legal references. These are the points where it invents most.
Turn on web search when you need up-to-date facts: anchoring the answer to real pages greatly reduces invention.
Ask it to state its uncertainty: "if you're not sure, say so instead of guessing". This doesn't make it infallible, but it reins in the guessing.
For weighty decisions (health, money, law), use it to understand the topic and to work out which questions to ask, never as your only source.
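The "to be verified" habit from the list above can be partly mechanized. The following is a minimal sketch, assuming a hypothetical helper `details_to_verify` and three rough patterns chosen for this example. It only marks where to look: it cannot tell whether a detail is true, it does not detect proper names, and a link that contains digits will also be flagged as a number.

```python
import re

# Rough, assumed patterns for the kinds of detail worth checking by hand.
CHECK_PATTERNS = {
    "quotation": re.compile(r'"[^"]+"'),
    "link": re.compile(r"https?://\S+"),
    "number": re.compile(r"\d+(?:[.,/-]\d+)*%?"),
}


def details_to_verify(answer):
    """Return (kind, text) pairs for each flagged detail, in reading order."""
    found = []
    for kind, pattern in CHECK_PATTERNS.items():
        for match in pattern.finditer(answer):
            found.append((match.start(), kind, match.group()))
    found.sort()
    return [(kind, text) for _, kind, text in found]


answer = 'The study "Sleep and Memory" was published in 2019 and found a 37% improvement.'
for kind, text in details_to_verify(answer):
    print(f"check this {kind}: {text}")
```

Each flagged item still has to be checked against a real source; the script only makes the checklist harder to skip.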

A common misconception

People think hallucinations are a minor flaw that more powerful models will eliminate entirely. That's not so. Inventing isn't an occasional malfunction: it's the flip side of the very mechanism that makes the AI useful, namely predicting plausible text. Better models hallucinate less, and on harder things, but they don't make the phenomenon disappear, because they keep doing the same thing: completing, not verifying. Expecting an AI that never makes a mistake means misunderstanding what it is.

Frequently asked questions

Does it happen with web search on too?

Less, but yes. Anchoring to real pages reduces invention, but the AI can still misread a source, cite it out of context or mix a true fact with an invented one. Web search lowers the risk, it doesn't eliminate it.

Is it my fault if it hallucinates?

No, the mechanism is built into the AI. But a vague question, or one that takes a false premise for granted, encourages invention: if you ask "what are the three studies that prove X", the AI tends to provide them even if they don't exist, because you asked for them. Honest, open questions help.

Do the paid models hallucinate less?

In general yes: they make fewer mistakes, and on harder questions. But they're not immune: even the most advanced model, faced with a rare or recent detail, fills the gap with something plausible. The price changes the frequency, not the nature of the phenomenon.

Is the AI more likely to be right the more confident it seems?

No, and this is the misconception that does the most damage. Confidence is a feature of the style, not of the substance: the AI writes something false in the same fluent, decisive tone it uses for something true, because in both cases it's only completing the most probable sentence. Don't read confidence as reliability. The only way to know whether it's right is still to check.

Quick answer