Your AI Coding Agent Reads Too Many Files — Here's the Fix

You ask an AI coding agent a question about a single function. It responds by reading 15 files. By the time it answers, your context window is half-full, you’ve spent real money on tokens, and the final explanation only needed 3 of those files.
This isn’t a bug in Claude Code or any other agent. It’s structural: without a code graph, the agent has no reliable way to know which files are relevant before it reads them. So it reads broadly and hopes.
This post explains why that happens, what it costs, and how graph-based context fixes it.
Why Agents Read Too Many Files
When you give an AI coding agent a task, it needs context about your codebase. It typically has three options:
- Ask you to provide the relevant files manually
- Search the filesystem for potentially relevant files
- Read files based on heuristics (e.g. directories that sound relevant)
Each option has serious drawbacks.
1. Manual file selection
You tell the agent exactly which files matter. This works, but only if you already understand the dependency surface of the problem.
For simple bugs, you often do: you know the function, the module, and maybe one or two helpers. For cross-module issues, refactors, or unfamiliar code, you usually don’t know the full blast radius.
2. Filesystem search
The agent searches by filename or directory structure. That’s a weak proxy for actual dependencies.
A function you care about might live in utils/helpers.ts alongside a dozen unrelated utilities. A filename match doesn’t tell you which symbols are actually involved in the behavior you’re debugging.
3. Heuristic directory reads
This is the most common pattern:
“The bug is in the payment module, so let me read everything in src/payments/.”

The agent lists the directory, then reads file after file because it has no better signal than “this folder name sounds relevant.” That’s how you end up with 15-file reads for a question that truly depends on 3–5 files.
The core problem: none of these approaches use the real signal — the dependency structure of your code.
- Which functions call which functions?
- Which types are used where?
- Which modules actually participate in this behavior?
Without a dependency graph, the agent is navigating your codebase blind.
For a deeper dive into what a dependency graph is and how it works for AI coding, see:
What Is a Dependency Graph for AI Coding?
The Cost of Over-Reading
Over-reading isn’t just an aesthetic problem. It has three concrete costs: tokens, context quality, and time.
1. Token cost
Every file read consumes tokens. A typical TypeScript file is ~2,000–4,000 tokens. Reading 15 such files means 30,000–60,000 tokens of raw code in your context.
At Claude Sonnet pricing, that’s roughly $0.09–$0.18 in input tokens per task just for file content, before any conversation history.
If a developer does 20 such tasks per day with over-reading, that’s $1.80–$3.60/day in unnecessary file-reading costs. At team scale, this compounds into thousands of dollars per month.
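To make the arithmetic concrete, here is a quick sketch, assuming ~3,000 tokens per file (the midpoint of the range above) and a Sonnet-class input rate of $3 per million tokens (an assumption; check current pricing before relying on these numbers):

```python
# Back-of-the-envelope cost of over-reading.
# Assumptions: ~3,000 tokens per TypeScript file, $3 per million input tokens.

TOKENS_PER_FILE = 3_000          # midpoint of the 2,000-4,000 range
PRICE_PER_MILLION_INPUT = 3.00   # USD, assumed Sonnet-class input rate

def file_read_cost(files_read: int) -> float:
    """Dollar cost of the raw file content read for one task."""
    tokens = files_read * TOKENS_PER_FILE
    return tokens * PRICE_PER_MILLION_INPUT / 1_000_000

# 15 files read when only 4 were needed: 11 unnecessary files per task.
per_task_waste = file_read_cost(15) - file_read_cost(4)
daily_waste = per_task_waste * 20  # 20 such tasks per day

print(f"wasted per task: ${per_task_waste:.3f}")  # $0.099
print(f"wasted per day:  ${daily_waste:.2f}")     # $1.98
```

That lands inside the $1.80–$3.60/day range above, and it counts only raw file content, not conversation history or output tokens.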
2. Context quality cost
More context is not always better context. Irrelevant files are noise. They:
- Dilute the signal from the truly relevant code
- Increase the chance the model latches onto the wrong pattern
- Make it harder for the agent to keep the right invariants in working memory
Counterintuitively, less but more relevant context often produces better answers than a huge, noisy context window.
3. Time cost
Each file read is a tool call. Tool calls typically add 500–2000ms of latency.
Reading 15 files can easily add 7–30 seconds before you see any answer at all. For interactive development, that latency is painful and breaks flow.
What Good File Selection Looks Like
For any given question about your code, there’s a minimal set of files that contains everything needed to answer correctly.
- For a typical bug fix, that’s usually 3–5 files
- For a feature addition touching multiple modules, maybe 7–10 files
Good file selection is about discovering that minimal set using the dependency graph, starting from the symbols your question touches.
A good process looks like this:
1. Identify pivot symbols
Extract the functions, classes, and types directly mentioned or implied by your question. These are the pivots.
2. Traverse outward along the graph
From each pivot, follow edges:
- What do these pivots call?
- What calls them?
- What types do they use or return?
3. Apply depth limits
Don’t traverse to depth 10. In practice, 2–3 hops from the pivots captures the essential context for most tasks.
4. Compress intelligently
- For pivots and very close neighbors: include full bodies
- For more distant context: include signatures, type definitions, or summaries only
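The pivot-and-traverse process above can be sketched with a plain breadth-first search over an adjacency map. Everything here is illustrative: the graph, the symbol names, and the two compression tiers are assumptions, not output from a real indexer.

```python
from collections import deque

# Hypothetical dependency graph: symbol -> directly related symbols
# (callees, callers, and referenced types collapsed into one edge set).
GRAPH = {
    "processPayment": ["validateCard", "PaymentError", "chargeGateway"],
    "chargeGateway": ["GatewayClient", "retryPolicy"],
    "validateCard": ["CardSchema"],
    "retryPolicy": ["backoff"],
    "CardSchema": [], "PaymentError": [], "GatewayClient": [], "backoff": [],
}

def context_slice(pivots, max_hops=2):
    """Collect symbols within max_hops of any pivot, tagged by tier."""
    dist = {p: 0 for p in pivots}
    queue = deque(pivots)
    while queue:
        sym = queue.popleft()
        if dist[sym] == max_hops:          # depth limit: stop expanding here
            continue
        for nbr in GRAPH.get(sym, []):
            if nbr not in dist:
                dist[nbr] = dist[sym] + 1
                queue.append(nbr)
    # Compression tiers: full bodies near the pivots, signatures further out.
    return {s: ("full body" if d <= 1 else "signature only")
            for s, d in dist.items()}

slice_ = context_slice(["processPayment"], max_hops=2)
# "backoff" is 3 hops out, so the depth limit excludes it entirely.
```

A real engine would keep call, caller, and type edges separate and weight them differently, but the shape of the traversal is the same.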
The result:
- Instead of 15 files, you might include 4
- Instead of 45,000 tokens, you might include 8,000
- The agent sees exactly what matters, with far less noise
Three Techniques to Reduce Over-Reading Right Now
You don’t need a full-blown context engine to start improving file selection. Here are three practical techniques you can use immediately.
1. Specify which files matter
When you give a task, explicitly constrain the scope:
“The bug is in src/auth/session.ts. It calls validateToken in src/auth/jwt.ts. Both files are relevant; nothing else should be needed.”
For tasks you understand reasonably well, this is ~20 seconds of typing that can eliminate 10+ unnecessary file reads.
2. Use file-level, not directory-level context
When using @mentions or file references with Claude Code (or similar tools), reference specific files, not directories.
- ✅ @src/auth/session.ts
- ✅ @src/auth/jwt.ts
- ❌ @src/auth/
Referencing a directory usually triggers a directory listing followed by broad file reads. Referencing specific files keeps the agent focused.
3. Set explicit constraints
Agents respond well to explicit scope boundaries. For example:
“The issue is isolated to processPayment. Please don’t read files outside of the payment module unless you have a specific, stated reason.”

Or:
“You may read at most 3 additional files beyond the ones I’ve provided. If you think you need more, explain why first.”
These constraints nudge the agent away from reflexive over-reading and toward more deliberate navigation.
For more patterns like this, see:
How to Give Your AI Coding Agent Better Context (Automatically)
The Automated Fix: Graph-Based Context
Manual techniques help when you already understand your codebase. But for:
- Debugging unfamiliar or legacy code
- Working in large, inherited monorepos
- Refactoring cross-cutting concerns
…you need automated, graph-based context selection.
Here’s how that works.
Indexing phase
A code graph is built from your repository:
- Every function, class, and type becomes a node
- Every call, import, and type reference becomes an edge
This indexing runs once and then updates incrementally as your code changes.
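As a rough illustration of the indexing phase, here is a minimal sketch using Python’s stdlib ast module to turn one module’s functions into nodes and its direct calls into edges. The SOURCE snippet is hypothetical, and a real indexer would also track imports, methods, classes, type references, and other languages.

```python
import ast

# Hypothetical module to index.
SOURCE = """
def validate(token):
    return token.startswith("v1.")

def load_session(token):
    if validate(token):
        return {"token": token}

def handler(request):
    return load_session(request)
"""

def build_graph(source: str):
    """Extract function nodes and direct call edges from one module."""
    tree = ast.parse(source)
    nodes, edges = set(), set()
    for fn in (n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)):
        nodes.add(fn.name)
        for call in (c for c in ast.walk(fn) if isinstance(c, ast.Call)):
            if isinstance(call.func, ast.Name):  # plain `foo(...)` calls only
                edges.add((fn.name, call.func.id))
    return nodes, edges

nodes, edges = build_graph(SOURCE)
# nodes: {'validate', 'load_session', 'handler'}
# edges: {('load_session', 'validate'), ('handler', 'load_session')}
```

Incremental updates then mean re-running this extraction only on files that changed and patching the affected nodes and edges, rather than rebuilding the whole graph.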
Query phase
When you ask a question, the system:
- Parses your query to identify pivot symbols (e.g. processPayment, UserSession, PaymentError)
- Maps those pivots into the graph
- Traverses outward along call, import, and type edges
High-centrality nodes (code that is widely used or frequently called) are prioritized, because they’re more likely to be relevant to behavior.
Filtering phase
The collected symbols are then filtered and ranked:
- Nodes more than 3 hops from any pivot are usually excluded
- Nodes that are rarely used or clearly unrelated are deprioritized
The output is a precise, token-efficient context slice: the minimal set of symbols that captures the behavior you’re asking about.
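A minimal sketch of this filtering step, assuming hop distances and usage counts (in-degree, a simple stand-in for centrality) have already been computed. Both maps below are made-up numbers:

```python
# Hypothetical precomputed data: hops from the nearest pivot, and how many
# other symbols reference each symbol (a crude degree-centrality score).
hops = {"processPayment": 0, "chargeGateway": 1, "GatewayClient": 2,
        "backoff": 4, "formatDate": 2}
in_degree = {"processPayment": 3, "chargeGateway": 5, "GatewayClient": 12,
             "backoff": 2, "formatDate": 1}

def rank_slice(hops, in_degree, max_hops=3):
    """Drop symbols too far from any pivot, then rank the rest."""
    kept = [s for s, d in hops.items() if d <= max_hops]
    # Closer to a pivot first; widely-referenced symbols break ties.
    return sorted(kept, key=lambda s: (hops[s], -in_degree[s]))

print(rank_slice(hops, in_degree))
# ['processPayment', 'chargeGateway', 'GatewayClient', 'formatDate']
```

Note that backoff is cut by the hop limit, and GatewayClient outranks formatDate at the same distance because it is referenced far more widely.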
Delivery phase
Finally, the context slice is formatted for the AI agent:
- Pivots: full code bodies
- Close neighbors: full bodies or detailed snippets
- Distant but important nodes: signatures, type definitions, or short summaries
- Everything else: omitted
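A sketch of that tiered formatting, with hypothetical symbol records standing in for real indexer output:

```python
# Each symbol arrives with a tier assigned during filtering.
symbols = [
    {"name": "processPayment", "tier": "pivot",
     "body": "def processPayment(order):\n    ...",
     "signature": "processPayment(order) -> Receipt"},
    {"name": "GatewayClient", "tier": "distant",
     "body": "class GatewayClient:\n    ...",
     "signature": "class GatewayClient"},
]

def render(symbols):
    """Format a context slice: full bodies near pivots, signatures elsewhere."""
    parts = []
    for s in symbols:
        if s["tier"] in ("pivot", "close"):
            parts.append(f"# {s['name']} (full)\n{s['body']}")
        else:
            parts.append(f"# {s['name']} (signature)\n{s['signature']}")
    return "\n\n".join(parts)

print(render(symbols))
```

The agent receives one compact text block instead of fifteen raw files, which is what makes the token numbers in the benchmark below possible.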
This entire process runs in under a second. The agent never needs to “spray and pray” across your filesystem because it’s operating on a precomputed map of your code.
Benchmark: Targeted vs. Broad File Reading
On a FastAPI application with 7 representative coding tasks, we compared broad reading vs. graph-based selection.
Broad reading (no graph):
- Average files read per task: 11.3
- Average input tokens: ~48,000
- Task completion time: baseline
- Task success rate: baseline
Graph-based selection:
- Average files read per task: 3.8
- Average input tokens: ~14,000 (65–70% fewer)
- Task completion time: 22% faster
- Task success rate: higher (fewer hallucinations from noisy context)
The key finding: quality improved even though the agent saw less code. Targeted context beats broad context on cost, latency, and correctness.
For a deeper breakdown of token savings and patterns, see:
How to Reduce Claude Code Token Usage by 58%
The Long-Term Cost of the Status Quo
Over-reading compounds over time:
- Every unnecessary file adds tokens
- More tokens increase cost and dilute context
- Diluted context leads to weaker answers and more iterations
- More iterations trigger more file reads
A team of 5 developers with heavy AI usage and unconstrained file reading can easily spend $2,000–$5,000/month on unnecessary token usage alone.
This isn’t a problem you can “discipline” your way out of. As long as the agent lacks a code graph, it will keep guessing which files to read.
The fix is architectural, not behavioral:
- Give the agent a dependency graph
- Let it select context via graph traversal
- Use manual constraints as a safety rail, not the primary mechanism
FAQ
Can’t the agent just get better at picking files?
Models have improved at code navigation, but without a pre-built graph they face a fundamental limit: they must read something to know whether it’s relevant.
A pre-built graph breaks that loop by letting the agent reason over structure without first ingesting every file.
Does over-reading affect quality or just cost?
Both.
Over-reading is like reading 15 books to answer a question that’s fully answered in 3 of them. You can find the answer, but:
- It takes longer
- You’re more likely to get distracted by irrelevant details
- You may misinterpret the key signal
The same thing happens with large, noisy contexts.
How is this different from just having a good codebase structure?
Good directory and module structure helps, but it’s not enough.
Real-world codebases have:
- Cross-cutting concerns (logging, auth, metrics)
- Shared utilities and helpers
- Types and interfaces reused across modules
These relationships don’t always align with directory boundaries. A graph captures actual dependencies, not just organizational intent.
What if my codebase is small?
For repos under ~100 files, over-reading is less painful:
- Token costs are lower
- The risk of severe noise is smaller
But as your codebase grows, the cost curve steepens. Starting with graph-based context early helps you avoid expensive habits later.
Does this work for polyglot repos?
Yes, if your graph engine supports multiple languages.
For example, vexp supports TypeScript, JavaScript, Python, Go, Rust, Java, C#, and others. Mixed-language repos benefit from cross-language dependency tracking — e.g. tracing a request from a TypeScript frontend through a Python backend into a Go microservice.
Nicola
Developer and creator of vexp — a context engine for AI coding agents. I build tools that make AI coding assistants faster, cheaper, and actually useful on real codebases.