How to Reduce AI Hallucinations in Code: The Context Quality Approach

Your AI assistant just generated a clean, well-structured function that imports from `@utils/validation`. The code looks right. The types check out. There's just one problem: `@utils/validation` doesn't exist in your codebase. Never has.
This is a code hallucination — and it's happening in 15-25% of AI-generated code suggestions according to studies tracking AI coding accuracy across real-world projects. The AI doesn't crash or throw an error. It confidently produces code that references functions, APIs, file paths, and patterns that are entirely invented.
For developers relying on AI to accelerate their workflow, hallucinations are the single biggest source of wasted time. Every hallucinated import that needs debugging, every invented API endpoint that needs replacing, every wrong function signature that causes a runtime error — these eat into the productivity gains that AI coding promised. And the fix isn't a better model. It's better context.
What Code Hallucinations Actually Are
A code hallucination occurs when an AI generates code that references something that doesn't exist in the target codebase. Unlike syntax errors or logical bugs, hallucinations *look* correct. They follow valid patterns, use reasonable naming conventions, and often pass a cursory review. The problem only surfaces when you try to run the code or when a teammate asks "where is this imported from?"
Hallucinations are distinct from regular bugs. A bug is wrong logic applied to real code. A hallucination is correct-looking logic applied to imaginary code. The distinction matters because they have different root causes and different solutions.
The most dangerous hallucinations are the subtle ones. An AI that invents a completely fake library is easy to catch. An AI that uses the right library but calls a function with the wrong signature — passing three arguments to a function that takes two, or using `v2` of an API when your project uses `v1` — is much harder to spot.
The Three Types of Code Hallucination
Non-Existent Imports and APIs
The most common type. The AI generates `import { validateEmail } from '@utils/validators'` when your project has no such module. Or it calls `response.sendJSON()` when your HTTP framework uses `response.json()`. Or it imports `lodash.deepClone` when the actual function is `lodash.cloneDeep`.
These hallucinations come from the model's training data. It has seen thousands of codebases with `@utils/validators` patterns, so it generates what's statistically likely rather than what's actually present. 40-50% of all code hallucinations fall into this category.
Wrong Function Signatures
The AI calls a real function with the wrong arguments. Your `createUser` function takes `(email, password, role)` but the AI generates a call with `(username, email, password)`. The function exists, the intent is correct, but the interface is wrong.
This type is particularly insidious because the function name is real — it shows up in search results, it exists in the codebase — but the AI confabulates the signature based on what "makes sense" rather than what's actually defined. 25-35% of code hallucinations are signature mismatches.
Invented File Paths and Architecture
The AI suggests editing `src/services/auth/tokenService.ts` when your auth logic lives in `lib/auth.ts`. Or it creates a new file following a `src/controllers/` convention when your project uses `src/routes/`. The architectural assumptions are plausible but fabricated.
This happens when the AI lacks visibility into your project structure. It fills the gap with the most common pattern from its training data — which might be a completely different architectural style from yours. 20-25% of code hallucinations involve invented paths or wrong structural assumptions.
Why Hallucinations Happen: The Knowledge Gap
Hallucinations aren't random. They follow a consistent pattern: the AI encounters a knowledge gap about your specific codebase and fills it with the most statistically probable answer from its training data.
When you ask the AI to "add input validation to the signup form," it needs to know: Where is the signup form? What validation library do you use? What's the existing validation pattern? What types are the form fields? How does error handling work?
If the AI can see your actual form component, your actual validation utility, and your actual error handling pattern, it generates code that uses them correctly. If it can't see these — if there's a knowledge gap — it invents plausible alternatives from training data.
The hallucination rate correlates directly with context quality. Developers who provide comprehensive, structured context report hallucination rates of 5-8%. Developers using default context (whatever the IDE automatically includes) report rates of 15-25%. Developers working on large codebases with minimal context see rates as high as 30-35%.
The relationship is close to linear: more knowledge gaps mean more hallucinations. Close the gaps, and most of the hallucinations go away.
Measuring Your Hallucination Rate
Before you can reduce hallucinations, you need to know how often they occur. Most developers underestimate their hallucination rate because they've internalized the correction process — they fix invented imports and wrong signatures automatically without consciously registering them as AI errors.
Track these metrics for one week:
- Non-existent import corrections: How many times did you fix an import that referenced a module or function that doesn't exist in your project?
- Signature corrections: How many times did you fix function calls where the arguments didn't match the actual function definition?
- Path corrections: How many times did you redirect the AI to the correct file or directory when it assumed the wrong location?
- Pattern corrections: How many times did you rewrite AI-generated code to match your project's actual patterns (different HTTP framework, different ORM syntax, different state management approach)?
Count each correction as one hallucination instance. Divide by the total number of AI suggestions you accepted or modified. That's your hallucination rate.
A rate above 15% means context quality is the bottleneck in your AI workflow. A rate above 25% means you're spending more time correcting hallucinations than you're saving with AI assistance.
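The calculation above can be sketched in a few lines of TypeScript. This is only an illustration: the `WeeklyTally` field names, the `Assessment` labels, and the sample numbers are made up for the example, while the 15% and 25% thresholds come from the text.

```typescript
// Hypothetical tally of one week of corrections; field names are illustrative.
interface WeeklyTally {
  importCorrections: number;
  signatureCorrections: number;
  pathCorrections: number;
  patternCorrections: number;
  suggestionsAcceptedOrModified: number;
}

type Assessment = "ok" | "context-bottleneck" | "net-negative";

// Each correction counts as one hallucination instance.
function hallucinationRate(t: WeeklyTally): number {
  if (t.suggestionsAcceptedOrModified <= 0) {
    throw new Error("Need at least one accepted or modified suggestion");
  }
  const corrections =
    t.importCorrections +
    t.signatureCorrections +
    t.pathCorrections +
    t.patternCorrections;
  return corrections / t.suggestionsAcceptedOrModified;
}

// Thresholds from the article: above 15% and above 25%.
function assess(rate: number): Assessment {
  if (rate > 0.25) return "net-negative";
  if (rate > 0.15) return "context-bottleneck";
  return "ok";
}

// Made-up sample week: 20 corrections across 100 suggestions.
const week: WeeklyTally = {
  importCorrections: 9,
  signatureCorrections: 6,
  pathCorrections: 3,
  patternCorrections: 2,
  suggestionsAcceptedOrModified: 100,
};

const weeklyRate = hallucinationRate(week);
console.log(`${(weeklyRate * 100).toFixed(1)}% -> ${assess(weeklyRate)}`);
// prints "20.0% -> context-bottleneck"
```

Keeping the four correction types as separate fields also shows which category dominates, which hints at the kind of context that is missing.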
How Dependency Graphs Prevent Hallucinations
A dependency graph is the definitive record of what exists in your codebase. Every symbol, every import, every function signature, every file path — all verified by static analysis against actual source code.
When AI has access to a dependency graph, knowledge gaps effectively vanish. The AI doesn't need to guess what your validation utility is called because the graph shows exactly which validation functions exist, where they're defined, what arguments they take, and what types they return.
Every Symbol Is Verified
A dependency graph built through static analysis only contains symbols that actually exist in the code. There's no `validateEmail` in the graph unless there's a `validateEmail` in the source. The AI has no gap to fill with an invented import, because every import in the context is verified to resolve to a real file.
This is fundamentally different from keyword search or embedding-based retrieval. Those approaches find files that are *similar* to the query, which might include patterns from other projects, deprecated code, or files that reference concepts without implementing them. A dependency graph shows exactly what *is*, not what *might be*.
Signatures Are Exact
When the dependency graph includes a function, it includes the actual signature — parameters, types, return values. The AI doesn't need to guess that `createUser` takes `(email, password, role)` because the graph provides the exact definition.
This removes the main cause of signature hallucinations. Studies show that developers using graph-based context see a 90%+ reduction in function signature errors compared to developers using keyword-based file search.
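To make the idea of exact signatures concrete, here is a toy symbol index. It is a sketch under loud assumptions: `SymbolEntry`, `checkCall`, and the `lib/users.ts` path are hypothetical, and comparing argument labels to parameter names is a stand-in for the type-level comparison a real static analyzer would do. It does not describe how any particular tool stores its graph.

```typescript
// Hypothetical shape for one entry in a symbol index; not any tool's real schema.
interface SymbolEntry {
  file: string;
  params: string[]; // parameter names, in declaration order
}

// Returns a list of problems; an empty list means the call matches the index.
function checkCall(
  index: Map<string, SymbolEntry>,
  fn: string,
  args: string[],
): string[] {
  const entry = index.get(fn);
  if (!entry) return [`${fn} is not defined anywhere in the index`];

  const problems: string[] = [];
  if (args.length !== entry.params.length) {
    problems.push(
      `${fn} expects ${entry.params.length} arguments, got ${args.length}`,
    );
  }
  // Naive positional check: compare what the caller thinks it is passing
  // against the declared parameter at the same position.
  entry.params.forEach((param, i) => {
    if (i < args.length && args[i] !== param) {
      problems.push(`argument ${i + 1} should be ${param}, got ${args[i]}`);
    }
  });
  return problems;
}

// Assumed location of createUser; the article only gives its parameter list.
const symbolIndex = new Map<string, SymbolEntry>([
  ["createUser", { file: "lib/users.ts", params: ["email", "password", "role"] }],
]);

// The hallucinated call from earlier in the article.
console.log(checkCall(symbolIndex, "createUser", ["username", "email", "password"]));
// A symbol that does not exist in the index.
console.log(checkCall(symbolIndex, "validateEmail", ["email"]));
```

The first call reports three positional mismatches and the second reports a missing symbol, which are the two failure modes this section describes.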
Architecture Is Explicit
The dependency graph captures the actual file structure and module relationships of your project. It knows that your auth logic lives in `lib/auth.ts`, not `src/services/auth/tokenService.ts`, because it indexes the real filesystem. The AI can't invent an architectural pattern that contradicts the graph.
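A rough sketch of the same check for file paths: given the real file list, reject a suggested path that does not exist and offer real files that share a meaningful path segment. The `realFiles` list, the `GENERIC` stoplist, and the token-matching heuristic are all assumptions made for illustration; a graph-backed tool would rely on actual module relationships instead of name overlap.

```typescript
// Hypothetical project file list; in practice this comes from the real filesystem.
const realFiles = ["lib/auth.ts", "lib/db.ts", "src/routes/users.ts"];

// Path segments too generic to be useful for matching (illustrative stoplist).
const GENERIC = new Set(["src", "lib", "ts", "js"]);

interface PathCheck {
  exists: boolean;
  candidates: string[]; // real files that share a segment with the suggestion
}

function checkSuggestedPath(files: string[], suggested: string): PathCheck {
  if (files.includes(suggested)) return { exists: true, candidates: [] };

  // "src/services/auth/tokenService.ts" -> ["services", "auth", "tokenservice"]
  const tokens = suggested
    .toLowerCase()
    .split(/[\/.]/)
    .filter((t) => t.length > 0 && !GENERIC.has(t));

  const candidates = files.filter((f) => {
    const parts = f.toLowerCase().split(/[\/.]/);
    return tokens.some((t) => parts.includes(t));
  });
  return { exists: false, candidates };
}

const pathResult = checkSuggestedPath(realFiles, "src/services/auth/tokenService.ts");
console.log(pathResult.exists, pathResult.candidates);
// prints: false [ 'lib/auth.ts' ]
```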
How vexp Reduces Hallucinations
vexp builds a complete dependency graph of your codebase through static analysis. When an AI agent needs context for a task, vexp serves a context capsule containing only verified symbols, real file paths, actual function signatures, and true dependency relationships.
The mechanism is straightforward: vexp eliminates the knowledge gaps that cause hallucinations. Instead of the AI guessing what exists in your codebase, vexp provides the verified answer. Every import in the context resolves to a real module. Every function call matches a real signature. Every file path points to a real file.
Developers using vexp for context report hallucination rates dropping from the typical 15-25% range to 5-8% — a reduction that translates directly into less debugging time, fewer incorrect suggestions, and higher acceptance rates for AI-generated code.
The reduction is especially dramatic for large codebases. On projects with 100K+ lines of code, where the AI can only see a tiny fraction of the codebase at once, hallucination rates without structural context can reach 30-35%. With graph-based context, they drop to the same 5-8% range regardless of codebase size. The graph scales; guessing doesn't.
Practical Hallucination Reduction Beyond Tools
Context engines handle the structural side of hallucination prevention. But several practices reduce hallucinations regardless of your tooling:
Be Specific in Prompts
"Fix the auth bug" gives the AI maximum room to hallucinate. "Fix the JWT expiration check in `src/auth/verifyToken.ts` that returns true for expired tokens" gives the AI a specific target with a real file path. Specific prompts reduce hallucinations by 30-40% even without additional context tools.
Include Real Examples
When asking the AI to create something new, include an example of an existing similar component from your codebase. "Create a new API endpoint following the same pattern as `GET /api/users` in `src/routes/users.ts`" grounds the AI in your actual patterns instead of training data patterns.
Validate Imports Immediately
Before reviewing generated logic, check every import statement. If the AI hallucinated an import, the rest of the code is likely built on that hallucination. Catching fake imports first saves you from debugging downstream issues that trace back to a non-existent module.
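A quick import check along these lines can be scripted. The sketch below pulls module specifiers out of generated code with a regular expression and compares them against a set of modules known to exist. The `knownModules` set and the sample snippet are hypothetical, and the regex is a simplification: it only handles static `import ... from "x"` statements, so a real checker would use a parser or the compiler's own module resolution.

```typescript
// Extract module specifiers from static `import ... from "x"` statements.
// Side-effect imports (`import "x"`), `require`, and dynamic `import()` are ignored.
function importedModules(source: string): string[] {
  const pattern = /import\s+[^'"]*?from\s+['"]([^'"]+)['"]/g;
  return Array.from(source.matchAll(pattern), (m) => m[1]);
}

// Return every imported specifier that is not in the known set.
function unresolvedImports(source: string, known: Set<string>): string[] {
  return importedModules(source).filter((spec) => !known.has(spec));
}

// Hypothetical AI-generated snippet.
const generated = `
import { z } from "zod";
import { validateEmail } from '@utils/validators';
import { createUser } from "./lib/users";
`;

// Assumed list of modules that really resolve in this project.
const knownModules = new Set(["zod", "./lib/users"]);

console.log(unresolvedImports(generated, knownModules));
// prints: [ '@utils/validators' ]
```

Anything this reports is worth resolving before reading the rest of the generated code.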
Use Type Checking as a Hallucination Detector
TypeScript's compiler is an excellent hallucination detector. Run `tsc --noEmit` on AI-generated code immediately. Type errors often reveal hallucinated function signatures, wrong argument counts, and non-existent module references before they reach runtime.
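As a small illustration of what the compiler catches, consider a stand-in for the article's `createUser(email, password, role)`. The `Role` union, the `User` shape, and the function body are assumptions made for this sketch. The `// @ts-expect-error` directives mark lines the compiler rejects, so the file still compiles while documenting the errors.

```typescript
type Role = "admin" | "member";

interface User {
  email: string;
  role: Role;
}

// Hypothetical stand-in for the article's createUser(email, password, role).
function createUser(email: string, password: string, role: Role): User {
  if (password.length === 0) throw new Error("password required");
  return { email, role };
}

// Correct call: type-checks.
const validUser = createUser("ada@example.com", "s3cret", "admin");
console.log(validUser.role); // prints "admin"

// Hallucinated argument order (username, email, password): the third
// argument is not a Role, so the compiler rejects this call.
// @ts-expect-error
const swappedArgs = createUser("ada", "ada@example.com", "s3cret");

// Hallucinated arity: two arguments for a three-parameter function.
// @ts-expect-error
const tooFewArgs = createUser("ada@example.com", "s3cret");
```

One caveat: if `role` were typed as a plain `string`, the swapped call would type-check, because all three arguments are strings. Narrow types such as unions give the compiler more to work with.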
Break Complex Tasks into Verifiable Steps
Instead of asking the AI to build a complete feature in one shot, break it into steps that you can verify independently. "First, show me the existing database schema for users" → verify the response → "Now add a `lastLogin` field to that schema" → verify → continue. Each verification step catches hallucinations before they compound.
The Context Quality Equation
Hallucinations in AI-generated code aren't a model problem that will be fixed by GPT-5 or Claude 5. They're a context problem that exists today because AI models fill knowledge gaps with statistical guesses.
The equation is simple: hallucination rate = f(knowledge gaps). Reduce the gaps, reduce the hallucinations. Eliminate the gaps, eliminate the hallucinations.
Dependency graphs eliminate structural knowledge gaps — what exists, what depends on what, what signatures look like, where files live. Specific prompts eliminate intent knowledge gaps. Real examples eliminate pattern knowledge gaps.
The developers who report the lowest hallucination rates aren't using better models. They're providing better context. The model's ceiling is set by the context it receives. Raise the context quality, and the ceiling rises with it.
Every hallucinated import you fix, every invented API you replace, every wrong signature you correct is time you could reclaim by investing in context quality. The math isn't subtle: a developer spending 15 minutes per working day fixing hallucinations spends about 5 hours per month on corrections, assuming roughly 20 working days. At a $75/hour loaded cost, that's $375/month in wasted engineering time, more than the cost of the AI tool itself.
Fix the context. The hallucinations fix themselves.
Nicola
Developer and creator of vexp — a context engine for AI coding agents. I build tools that make AI coding assistants faster, cheaper, and actually useful on real codebases.