Hallucinations Aren’t Real

0001-01-01

In the early days of LLMs, "hallucinations" were commonplace. Models would answer prompts with something entirely made up, like recommending non-existent libraries or inventing quotes that nobody ever said. (fun example). This is a product of how LLMs work: [[[They Tell You What You Want To Hear]]]().

However, in large public-facing chatbots today, hallucinations are essentially solved. This is partly because of reasoning, partly better training, partly tool use, and partly because every answer is analyzed by several other agents focused on eliminating errors. While [[[Models Have Limits]]](), hallucinations are not inherent to them.

The Actual Problem: Inadequate Subtext

When you ask a human to troubleshoot a problem in a complex system, they first need to understand all the factors that could lead to the bug. For instance, you might have an application that fails to start because of a missing config value, which comes from an env var, which is injected by Helm. But the Helm chart never references it.
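To make the scenario concrete, here is a minimal sketch of the first check: confirming that nothing in the chart mentions the variable. The variable names (`DATABASE_URL`, `LOG_LEVEL`), the file names, and the chart contents are all hypothetical, and the chart is modeled as an in-memory dict rather than read from disk.

```python
import re


def find_references(name: str, files: dict[str, str]) -> list[tuple[str, int]]:
    """Return (filename, line number) for every line mentioning `name` as a whole word."""
    pattern = re.compile(rf"\b{re.escape(name)}\b")
    hits = []
    for filename, text in files.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((filename, lineno))
    return hits


# Hypothetical chart contents, keyed by file path (assumption, not a real chart).
chart = {
    "templates/deployment.yaml": (
        "env:\n"
        "  - name: LOG_LEVEL\n"
        "    value: {{ .Values.logLevel | quote }}\n"
    ),
    "values.yaml": "logLevel: info\n",
}

# The variable the application needs is referenced nowhere in the chart.
missing = find_references("DATABASE_URL", chart)

# A variable the chart does set shows up with its file and line number.
present = find_references("LOG_LEVEL", chart)

print(missing)
print(present)
```

An empty result only confirms the symptom. It says nothing about where the value comes from in environments where the application does start, and that is the more interesting question.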

An LLM (or junior engineer) might simply say, "well, add the reference." But with more experience, the right question is "how could this have ever worked for anyone?" An experienced engineer knows that when you ask that question, it means you’re very, very close to a breakthrough and complete understanding. You know enough about the system to know what you don’t know. The most exciting phrase isn’t "Eureka!", it’s "that’s weird."

This sort of thing is currently the biggest pitfall of agentic workflows. Spring wiring, Gradle plugins, inheritance in all forms, the state of objects in up/downstream systems, partial successes, comprehending parallelism: these things absolutely trash LLM agents. And to be fair, they’re huge and unnecessary obstacles to human understanding of the systems as well. Engineering benefits from straightforwardness. The less that [[The Practical Engineer]], or an LLM, has to understand about the broader system, the faster you can make changes.