How to Reduce AI Hallucinations in Code: The Context Quality Approach

Nicola·
How to Reduce AI Hallucinations in Code: The Context Quality Approach

How to Reduce AI Hallucinations in Code: The Context Quality Approach

Your AI assistant just generated a clean, well-structured function that imports from `@utils/validation`. The code looks right. The types check out. There's just one problem: `@utils/validation` doesn't exist in your codebase. Never has.

This is a code hallucination — and it's happening in 15-25% of AI-generated code suggestions according to studies tracking AI coding accuracy across real-world projects. The AI doesn't crash or throw an error. It confidently produces code that references functions, APIs, file paths, and patterns that are entirely invented.

For developers relying on AI to accelerate their workflow, hallucinations are the single biggest source of wasted time. Every hallucinated import that needs debugging, every invented API endpoint that needs replacing, every wrong function signature that causes a runtime error — these eat into the productivity gains that AI coding promised. And the fix isn't a better model. It's better context.

What Code Hallucinations Actually Are

A code hallucination occurs when an AI generates code that references something that doesn't exist in the target codebase. Unlike syntax errors or logical bugs, hallucinations *look* correct. They follow valid patterns, use reasonable naming conventions, and often pass a cursory review. The problem only surfaces when you try to run the code or when a teammate asks "where is this imported from?"

Hallucinations are distinct from regular bugs. A bug is wrong logic applied to real code. A hallucination is correct-looking logic applied to imaginary code. The distinction matters because they have different root causes and different solutions.

The most dangerous hallucinations are the subtle ones. An AI that invents a completely fake library is easy to catch. An AI that uses the right library but calls a function with the wrong signature — passing three arguments to a function that takes two, or using `v2` of an API when your project uses `v1` — is much harder to spot.

The Three Types of Code Hallucination

Non-Existent Imports and APIs

The most common type. The AI generates `import { validateEmail } from '@utils/validators'` when your project has no such module. Or it calls `response.sendJSON()` when your HTTP framework uses `response.json()`. Or it imports `lodash.deepClone` when the actual function is `lodash.cloneDeep`.

These hallucinations come from the model's training data. It has seen thousands of codebases with `@utils/validators` patterns, so it generates what's statistically likely rather than what's actually present. 40-50% of all code hallucinations fall into this category.

Wrong Function Signatures

The AI calls a real function with the wrong arguments. Your `createUser` function takes `(email, password, role)` but the AI generates a call with `(username, email, password)`. The function exists, the intent is correct, but the interface is wrong.

This type is particularly insidious because the function name is real — it shows up in search results, it exists in the codebase — but the AI confabulates the signature based on what "makes sense" rather than what's actually defined. 25-35% of code hallucinations are signature mismatches.

Invented File Paths and Architecture

The AI suggests editing `src/services/auth/tokenService.ts` when your auth logic lives in `lib/auth.ts`. Or it creates a new file following a `src/controllers/` convention when your project uses `src/routes/`. The architectural assumptions are plausible but fabricated.

This happens when the AI lacks visibility into your project structure. It fills the gap with the most common pattern from its training data — which might be a completely different architectural style from yours. 20-25% of code hallucinations involve invented paths or wrong structural assumptions.

Why Hallucinations Happen: The Knowledge Gap

Hallucinations aren't random. They follow a consistent pattern: the AI encounters a knowledge gap about your specific codebase and fills it with the most statistically probable answer from its training data.

When you ask the AI to "add input validation to the signup form," it needs to know: Where is the signup form? What validation library do you use? What's the existing validation pattern? What types are the form fields? How does error handling work?

If the AI can see your actual form component, your actual validation utility, and your actual error handling pattern, it generates code that uses them correctly. If it can't see these — if there's a knowledge gap — it invents plausible alternatives from training data.

The hallucination rate correlates directly with context quality. Developers who provide comprehensive, structured context report hallucination rates of 5-8%. Developers using default context (whatever the IDE automatically includes) report rates of 15-25%. Developers working on large codebases with minimal context see rates as high as 30-35%.

The relationship is almost linear: more knowledge gaps = more hallucinations. Close the gaps, and the hallucinations disappear.

Measuring Your Hallucination Rate

Before you can reduce hallucinations, you need to know how often they occur. Most developers underestimate their hallucination rate because they've internalized the correction process — they fix invented imports and wrong signatures automatically without consciously registering them as AI errors.

Track these metrics for one week:

  • Non-existent import corrections: How many times did you fix an import that referenced a module or function that doesn't exist in your project?
  • Signature corrections: How many times did you fix function calls where the arguments didn't match the actual function definition?
  • Path corrections: How many times did you redirect the AI to the correct file or directory when it assumed the wrong location?
  • Pattern corrections: How many times did you rewrite AI-generated code to match your project's actual patterns (different HTTP framework, different ORM syntax, different state management approach)?

Count each correction as one hallucination instance. Divide by the total number of AI suggestions you accepted or modified. That's your hallucination rate.

A rate above 15% means context quality is the bottleneck in your AI workflow. A rate above 25% means you're spending more time correcting hallucinations than you're saving with AI assistance.

How Dependency Graphs Prevent Hallucinations

A dependency graph is the definitive record of what exists in your codebase. Every symbol, every import, every function signature, every file path — all verified by static analysis against actual source code.

When AI has access to a dependency graph, knowledge gaps effectively vanish. The AI doesn't need to guess what your validation utility is called because the graph shows exactly which validation functions exist, where they're defined, what arguments they take, and what types they return.

Every Symbol Is Verified

A dependency graph built through static analysis only contains symbols that actually exist in the code. There's no `validateEmail` in the graph unless there's a `validateEmail` in the source. The AI can't hallucinate an import because every import in the context is verified to resolve to a real file.

This is fundamentally different from keyword search or embedding-based retrieval. Those approaches find files that are *similar* to the query, which might include patterns from other projects, deprecated code, or files that reference concepts without implementing them. A dependency graph shows exactly what *is*, not what *might be*.

Signatures Are Exact

When the dependency graph includes a function, it includes the actual signature — parameters, types, return values. The AI doesn't need to guess that `createUser` takes `(email, password, role)` because the graph provides the exact definition.

This eliminates signature hallucinations entirely. Studies show that developers using graph-based context see a 90%+ reduction in function signature errors compared to developers using keyword-based file search.

Architecture Is Explicit

The dependency graph captures the actual file structure and module relationships of your project. It knows that your auth logic lives in `lib/auth.ts`, not `src/services/auth/tokenService.ts`, because it indexes the real filesystem. The AI can't invent an architectural pattern that contradicts the graph.

How vexp Reduces Hallucinations

vexp builds a complete dependency graph of your codebase through static analysis. When an AI agent needs context for a task, vexp serves a context capsule containing only verified symbols, real file paths, actual function signatures, and true dependency relationships.

The mechanism is straightforward: vexp eliminates the knowledge gaps that cause hallucinations. Instead of the AI guessing what exists in your codebase, vexp provides the verified answer. Every import in the context resolves to a real module. Every function call matches a real signature. Every file path points to a real file.

Developers using vexp for context report hallucination rates dropping from the typical 15-25% range to 5-8% — a reduction that translates directly into less debugging time, fewer incorrect suggestions, and higher acceptance rates for AI-generated code.

The reduction is especially dramatic for large codebases. On projects with 100K+ lines of code, where the AI can only see a tiny fraction of the codebase at once, hallucination rates without structural context can reach 30-35%. With graph-based context, they drop to the same 5-8% range regardless of codebase size. The graph scales; guessing doesn't.

Practical Hallucination Reduction Beyond Tools

Context engines handle the structural side of hallucination prevention. But several practices reduce hallucinations regardless of your tooling:

Be Specific in Prompts

"Fix the auth bug" gives the AI maximum room to hallucinate. "Fix the JWT expiration check in `src/auth/verifyToken.ts` that returns true for expired tokens" gives the AI a specific target with a real file path. Specific prompts reduce hallucinations by 30-40% even without additional context tools.

Include Real Examples

When asking the AI to create something new, include an example of an existing similar component from your codebase. "Create a new API endpoint following the same pattern as `GET /api/users` in `src/routes/users.ts`" grounds the AI in your actual patterns instead of training data patterns.

Validate Imports Immediately

Before reviewing generated logic, check every import statement. If the AI hallucinated an import, the rest of the code is likely built on that hallucination. Catching fake imports first saves you from debugging downstream issues that trace back to a non-existent module.

Use Type Checking as a Hallucination Detector

TypeScript's compiler is an excellent hallucination detector. Run `tsc --noEmit` on AI-generated code immediately. Type errors often reveal hallucinated function signatures, wrong argument counts, and non-existent module references before they reach runtime.

Break Complex Tasks into Verifiable Steps

Instead of asking the AI to build a complete feature in one shot, break it into steps that you can verify independently. "First, show me the existing database schema for users" → verify the response → "Now add a `lastLogin` field to that schema" → verify → continue. Each verification step catches hallucinations before they compound.

The Context Quality Equation

Hallucinations in AI-generated code aren't a model problem that will be fixed by GPT-5 or Claude 5. They're a context problem that exists today because AI models fill knowledge gaps with statistical guesses.

The equation is simple: hallucination rate = f(knowledge gaps). Reduce the gaps, reduce the hallucinations. Eliminate the gaps, eliminate the hallucinations.

Dependency graphs eliminate structural knowledge gaps — what exists, what depends on what, what signatures look like, where files live. Specific prompts eliminate intent knowledge gaps. Real examples eliminate pattern knowledge gaps.

The developers who report the lowest hallucination rates aren't using better models. They're providing better context. The model's ceiling is set by the context it receives. Raise the context quality, and the ceiling rises with it.

Every hallucinated import you fix, every invented API you replace, every wrong signature you correct — that's time you could reclaim by investing in context quality. The math isn't subtle: a developer spending 15 minutes per day fixing hallucinations spends 5+ hours per month on corrections. At a $75/hour loaded cost, that's $375/month in wasted engineering time — more than the cost of the AI tool itself.

Fix the context. The hallucinations fix themselves.

Frequently Asked Questions

What percentage of AI-generated code contains hallucinations?
Studies tracking AI coding accuracy across real-world projects show that 15-25% of AI code suggestions contain hallucinated elements — non-existent imports, wrong function signatures, or invented file paths. On large codebases (100K+ lines), the rate can reach 30-35%. With structured context from dependency graphs, the rate drops to 5-8% regardless of codebase size.
Are hallucinations different from regular coding bugs?
Yes, fundamentally. A regular bug is wrong logic applied to real code — the function exists but the logic is incorrect. A hallucination is correct-looking logic applied to imaginary code — the AI invents functions, modules, or APIs that don't exist in your codebase. Hallucinations require different solutions: bugs are fixed with better reasoning, while hallucinations are fixed with better context that shows the AI what actually exists.
Will better AI models eliminate code hallucinations?
Not by themselves. Hallucinations occur when the model lacks information about your specific codebase and fills the gap with training data patterns. A more powerful model might guess more accurately, but it's still guessing. The only way to eliminate structural hallucinations is to provide structural context — verified dependency graphs, real file paths, actual function signatures. Better models plus better context reduces hallucinations; better models alone just produces more confident-sounding hallucinations.
How can I quickly detect hallucinations in AI-generated code?
Three fast checks: First, verify every import statement — if an import doesn't resolve, the downstream code is likely hallucinated too. Second, run your type checker (e.g., `tsc --noEmit` for TypeScript) immediately on generated code, as type errors frequently reveal wrong function signatures and non-existent modules. Third, check file paths referenced in the code against your actual project structure. These three checks catch 80-90% of hallucinations before they reach runtime.
Does providing more files as context reduce hallucinations?
More context helps, but only if it's the right context. Dumping 20 random files into the context window can actually increase hallucinations by overwhelming the model with irrelevant information and leaving less room for relevant context. Structured context — dependency graphs showing verified symbols and relationships — is far more effective than raw file content. A 5,000-token context capsule with graph-ranked dependencies prevents more hallucinations than a 50,000-token dump of raw files.

Nicola

Developer and creator of vexp — a context engine for AI coding agents. I build tools that make AI coding assistants faster, cheaper, and actually useful on real codebases.

Related Articles