How to read your AI
A confident bluffer is more dangerous than an obvious liar. Three concrete grading moves you can run on any AI output in under two minutes — before you ship it to a client, a regulator, or yourself.
An AI gave you something. A summary, an email, a chunk of code, a research brief. It looks clean. Maybe even elegant. The temptation is to act on it.
Don’t. Not yet.
The thing about modern language models is they speak the same language whether they’re right or wrong. Confidence is a vibe, not a signal. The same paragraph that nailed a tricky derivation can, a sentence later, invent a citation that doesn’t exist — in the same calm voice. That’s not a flaw to be patched out next quarter; it’s the medium itself. If you’re going to use these tools, the work of reading the output is part of using them. It’s not optional.
Here’s the operating principle:
You don’t use what you can’t see into. Call it the glass-box rule. If the output can’t be inspected, if the only check available is re-reading it and deciding it sounds right, it isn’t ready to ship.
Two minutes. Three moves. You’ll catch most of the trouble.
The three moves
01 Independent re-derivation of one claim
Not all of the claims. One. The one a real reader would push back on. If the AI says “Sennrich et al. introduced BPE to NLP in 2016”, open a tab, find the paper, read the abstract. If the AI says “the law in Texas requires X”, open the statute. If it says “the patient’s LDL was 132”, open the chart.
If the one you tested holds, the rest probably does too. If it doesn’t hold (the citation is fabricated, the statute is misnamed, the number is wrong), treat the whole output as untrusted and start over. One bad claim in confident prose is a signal about the whole pour, not a stray.
02 Read the code, not just the comment
AI-written code comes with a story: “This function does X by doing Y.” The story is often correct and pleasant. It is also just more generated text, not a readout of what the code does, so the two can drift. The story can describe behavior the code doesn’t implement, or paper over a subtle bug the code has.
The move is simple: cover the comments and explanations and read what the code actually does, line by line. Trace one input through it by hand. If it does what the story says, fine. If it doesn’t, you’ve just caught the drift — before production caught it for you.
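A small illustration of the kind of drift this move catches. This is a made-up sketch, not taken from any real AI output: the function names and the sample input are hypothetical. The first function’s docstring tells one story while its body does something else; the second is what the story actually describes.

```python
def dedupe_keep_order(items):
    """Remove duplicates while preserving the original order."""
    # Drift: set() discards order and sorted() imposes a new one,
    # so "preserving the original order" is not what this code does.
    return sorted(set(items))


def dedupe_keep_order_fixed(items):
    """Remove duplicates while preserving the original order."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:  # keep only the first occurrence
            seen.add(item)
            result.append(item)
    return result


# Trace one input by hand first, then run it to confirm.
sample = [3, 1, 3, 2]
print(dedupe_keep_order(sample))        # [1, 2, 3]  order lost
print(dedupe_keep_order_fixed(sample))  # [3, 1, 2]  matches the story
```

Both versions pass a casual glance and both return a list with no duplicates. Only tracing `[3, 1, 3, 2]` through the body shows that the first one reorders its input.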
03 Check the meta-claims about what the AI did
“I ran the tests and they pass.” “I searched the database and found three matches.” “I read the file and updated section 3.” These are claims about actions — and an AI describing actions is just as prone to confident error as an AI describing facts. Maybe more.
If the AI says it ran tests, look for the test output. If it says it searched, look at the search results. If it says it edited a file, open the file and look at the diff. The minute and a half this costs is the cheapest insurance you’ll buy that day.
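For the “I edited the file” case, the check can be as small as a diff. The sketch below is one possible way to do it with Python’s standard `difflib`, assuming you kept a copy of the file’s text from before the AI touched it. The helper name `changed_lines` and the sample strings are hypothetical.

```python
import difflib


def changed_lines(before: str, after: str) -> list[str]:
    """Return the added and removed lines between two versions of a text."""
    diff = list(difflib.unified_diff(
        before.splitlines(), after.splitlines(),
        fromfile="before", tofile="after", lineterm="",
    ))
    # The first two diff lines are the "---"/"+++" file headers; skip them,
    # then keep only lines marked as removed ("-") or added ("+").
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


before = "Section 3\nold limit: 10\n"
claimed_after = "Section 3\nold limit: 10\n"  # AI said it updated section 3
real_after = "Section 3\nnew limit: 20\n"

print(changed_lines(before, claimed_after))  # []
print(changed_lines(before, real_after))     # ['-old limit: 10', '+new limit: 20']
```

An empty list means nothing changed, whatever the AI reported. A non-empty list is the thing to read: it shows what was actually edited, which may or may not be what was described.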
The honest limit
These three moves do not catch everything. They catch most of the trouble (the confident hallucination, the drifted comment, the falsely reported action), and they cost under two minutes per artifact. That’s the trade. If the work matters more than two minutes’ worth of trouble, the moves pay for themselves immediately. If it doesn’t matter that much, you’re probably not reading this dispatch.
The point isn’t suspicion. The point is partnership. AI is a tool that’s vastly more useful when you can see into it — and a tool you should not use, on anything that matters, until you’ve built the habit of looking. The work of looking is the work. There’s no shortcut, but there’s also no excuse for not doing it.
If you’re using AI on anything regulated, anything published, anything that costs someone money or reputation, you don’t need a vendor; you need a discipline. The three moves above are the floor. Everything else is upholstery.