Claude Code Subagents: Complete Guide to Custom AI Workers

Claude Code can spawn child processes that work in parallel. These subagents — independent Claude instances running inside your terminal — are the closest thing AI coding has to multithreading. Most developers never use them. The ones who do report 40-60% faster completion on complex, multi-file tasks.
But subagents have a critical limitation that nobody talks about: they start blind. Each one gets a fresh context window with zero codebase knowledge. That's the difference between subagents that accelerate your workflow and subagents that waste tokens reading files you already understand.
Here's how to use them properly.
What Subagents Actually Are
A subagent is a separate Claude instance spawned by your main Claude Code session. It runs as a child process with its own context window, its own tool access, and its own conversation history. The parent agent delegates a task, the subagent executes it independently, and the result flows back.
Think of it like forking a process in Unix. The subagent inherits the task description from the parent but not the parent's accumulated context. It starts fresh — which is both its greatest strength and its most expensive weakness.
Key characteristics of subagents:
- Isolated context window — Each subagent gets its own token budget, separate from the parent
- Parallel execution — Multiple subagents can run simultaneously on different tasks
- Independent tool access — Subagents can read files, run commands, search code, and use MCP tools
- Scoped output — Results are summarized back to the parent agent when the subagent completes
- No shared state — Subagents cannot see what other subagents or the parent are doing in real time
This isolation is intentional. It prevents context contamination between parallel tasks and keeps each subagent focused on its specific job.
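The fork analogy can be sketched in a few lines of ordinary Python. This is not the Claude Code API — the worker function is a stand-in — but it shows the shape of the pattern: fan out isolated workers, collect summaries back in the parent.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in: each "subagent" is a function that receives only
# its task description -- no shared state with the parent or its siblings.
def subagent(task: str) -> str:
    # A real subagent would explore, edit, and summarize here.
    return f"done: {task}"

tasks = ["write tests for auth", "review src/api", "fix bug #42"]

# Fan out: each worker runs independently and cannot observe the others.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(subagent, tasks))

# Results flow back to the parent only after each worker completes.
for summary in results:
    print(summary)
```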
The Three Types of Subagents
Claude Code supports three distinct subagent modes, each optimized for different workloads.
General-Purpose Subagents
The default mode. You describe a task, and Claude Code spawns a subagent to handle it. The subagent has full tool access — it can read files, write code, run shell commands, and interact with MCP servers.
Use general-purpose subagents for tasks that require code changes: implementing a feature in a specific module, fixing a bug in an isolated component, or writing tests for a particular function.
Explore Subagents
Read-only subagents designed for research and investigation. Explore subagents can read files and search code but cannot make changes. They're cheaper to run (lower token usage) and safer to deploy in parallel because they can't create conflicting edits.
Use Explore subagents when you need to understand something: tracing a call chain, finding all usages of a function, investigating how a system works before modifying it, or auditing code for patterns.
Plan Subagents
Analysis-focused subagents that produce structured plans rather than code changes. A Plan subagent investigates a problem, considers approaches, and returns a detailed implementation plan — but doesn't execute it.
Use Plan subagents for architectural decisions, refactoring strategies, or breaking complex tasks into subtasks that other subagents can execute.
When to Use Subagents (And When Not To)
Subagents shine when your task is decomposable — meaning it can be split into independent pieces that don't depend on each other's output.
High-value subagent scenarios:
- Parallel test writing — Spawn one subagent per module to write tests simultaneously. A test suite that takes 45 minutes sequentially finishes in 12 minutes with 4 parallel subagents.
- Multi-file code review — Each subagent reviews a different file or directory. You get comprehensive feedback without burning your main session's context on files you won't edit.
- Isolated bug fixes — When you have 3 independent bugs, subagents can investigate and fix each one in parallel.
- Research tasks — Explore subagents can trace how a feature works across the codebase while your main session continues other work.
- Cross-package updates — In monorepos, spawn subagents per package to make coordinated but independent changes.
When subagents are the wrong choice:
- Sequential dependencies — If task B needs the output of task A, subagents don't help. You'll wait for A to finish anyway.
- Small, fast tasks — The overhead of spawning a subagent (context initialization, task description, result summarization) isn't worth it for tasks that take 30 seconds sequentially.
- Heavily interconnected changes — If every file you're changing imports from every other file you're changing, parallel edits create merge conflicts.
Creating Effective Subagent Tasks
The quality of a subagent's work depends almost entirely on the quality of the task description you give it. Vague tasks produce vague results. Specific tasks produce targeted, high-quality output.
The Task Description Formula
Every subagent task should include four elements:
- Objective — What exactly should the subagent produce?
- Scope — Which files, directories, or modules should it focus on?
- Constraints — What should it NOT do? What patterns should it follow?
- Context — What background information does it need to understand the task?
A bad task description: "Write tests for the auth module."
A good task description: "Write unit tests for `src/auth/validateToken.ts`. Test the happy path (valid JWT with correct claims), expired token, malformed token, missing required claims (role, userId), and token signed with wrong key. Use vitest. Mock the `jsonwebtoken` library. Follow the test patterns in `src/auth/__tests__/hashPassword.test.ts`."
The difference in output quality is dramatic. The good description eliminates 80% of the subagent's exploration overhead — it doesn't need to discover the test framework, find existing test patterns, or figure out which edge cases matter.
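The four-element formula can be captured in a small template helper. This is an illustrative sketch — Claude Code accepts free-form task text, and the field names here are my own — but enforcing the structure programmatically is a cheap way to stop yourself from writing vague tasks.

```python
def build_task(objective: str, scope: str, constraints: str, context: str) -> str:
    """Assemble a subagent task description from the four elements.

    Illustrative helper only: Claude Code takes free-form text; this just
    makes it impossible to omit one of the four elements.
    """
    return "\n".join([
        f"Objective: {objective}",
        f"Scope: {scope}",
        f"Constraints: {constraints}",
        f"Context: {context}",
    ])

task = build_task(
    objective="Write unit tests for src/auth/validateToken.ts",
    scope="src/auth/validateToken.ts and its __tests__ directory only",
    constraints="Use vitest; mock jsonwebtoken; do not modify other modules",
    context="Follow the patterns in src/auth/__tests__/hashPassword.test.ts",
)
print(task)
```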
Passing Context to Subagents
Here's the critical problem: your main session has been accumulating context for 20 minutes. You understand the codebase structure, the relevant files, the coding patterns. When you spawn a subagent, none of that transfers.
The subagent starts cold. It will spend its first 500-2000 tokens just reading files to build the understanding you already have. Multiply that by 4 parallel subagents, and you've burned 2000-8000 tokens on redundant exploration.
Three strategies to minimize this waste:
Strategy 1: Inline context in the task. Include relevant code snippets, type definitions, or architectural notes directly in the task description. This front-loads context so the subagent doesn't need to discover it.
Strategy 2: Point to specific files. Instead of "look at the auth module," say "read `src/auth/types.ts` for the Token interface and `src/auth/middleware.ts` for how tokens are validated in the request pipeline." Specific file paths eliminate search overhead.
Strategy 3: Use a context engine. This is the most token-efficient approach. Tools like vexp pre-compute which code symbols are relevant to a given task using a dependency graph. When you spawn a subagent, it calls `run_pipeline` with its specific task and immediately receives graph-ranked, pre-scoped context — no file exploration needed. Each subagent gets exactly the code relevant to its subtask, typically in 200-400 tokens instead of 2000+.
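Strategies 1 and 2 can be combined mechanically: read the specific files you want the subagent to see and inline their contents into the task text before spawning. A sketch (the helper and its wording are my own, not part of Claude Code):

```python
from pathlib import Path

def task_with_inline_context(task: str, paths: list[str]) -> str:
    """Prepend the contents of specific files to a subagent task so the
    worker starts warm instead of searching for context itself."""
    sections = [task, "", "Relevant files (already read for you; do not re-explore):"]
    for p in paths:
        sections.append(f"--- {p} ---")
        sections.append(Path(p).read_text())
    return "\n".join(sections)
```

Pass the result as the subagent's task description; the worker then spends its tokens on the change itself rather than on rediscovering the `Token` interface or the middleware pipeline.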
The Context Problem in Detail
Let's quantify how subagent context waste compounds.
A typical subagent workflow without context optimization:
- Parent session has been running for 15 minutes, accumulated 25K tokens of context
- Parent spawns 4 subagents for parallel tasks
- Each subagent spends ~1500 tokens exploring the codebase to understand structure
- Each subagent reads 3-5 files to understand relevant patterns (~2000 tokens each)
- Each subagent then does the actual work (~3000 tokens each)
Total subagent cost: 4 × 6,500 = 26,000 tokens. Of that, 14,000 tokens (54%) were spent on exploration and context building that the parent session already completed.
The same workflow with pre-scoped context:
- Parent session has been running for 15 minutes
- Parent spawns 4 subagents, each with pre-scoped context from a dependency graph
- Each subagent starts with relevant code already in context (~400 tokens)
- Each subagent does the actual work (~3000 tokens each)
Total subagent cost: 4 × 3,400 = 13,600 tokens. That's a 48% reduction — and the subagents produce better output because their context is signal-dense rather than exploration-heavy.
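The arithmetic above checks out; the per-subagent figures are the estimates from this article, not measurements:

```python
# Without pre-scoped context: exploration + file reading + actual work.
explore, read, work = 1500, 2000, 3000
subagents = 4

blind_total = subagents * (explore + read + work)   # 26,000 tokens
overhead = subagents * (explore + read)             # 14,000 tokens of redundant discovery
overhead_share = overhead / blind_total             # ~54% of spend is waste

# With pre-scoped context: a small seed of relevant code, then the work.
seed = 400
scoped_total = subagents * (seed + work)            # 13,600 tokens
reduction = 1 - scoped_total / blind_total          # ~48% cheaper overall

print(blind_total, scoped_total, round(overhead_share, 2), round(reduction, 2))
```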
Best Practices for Subagent Workflows
After working with subagent-heavy workflows across dozens of projects, these patterns consistently produce the best results.
1. Plan First, Then Parallelize
Don't jump straight to parallel subagents. Use a Plan subagent (or your main session) to analyze the task, identify independent subtasks, and define clear boundaries between them. The 2 minutes you spend planning saves 10 minutes of subagent thrashing.
2. Keep Subagent Tasks Small
The ideal subagent task takes 3-8 minutes to complete. Smaller tasks have tighter scope, produce more focused output, and are easier to verify. If a subagent task would take 20+ minutes, break it down further.
3. Use Explore Before Execute
Before spawning general-purpose subagents to make changes, spawn an Explore subagent to investigate the codebase structure. Feed its findings into the task descriptions for your execution subagents. This two-phase approach produces significantly better results than blind parallel execution.
4. Pre-scope Context for Every Subagent
Never let subagents explore the codebase from scratch. Either include relevant context inline, point to specific files, or use a context engine like vexp to serve pre-computed dependency-ranked code. The token savings compound linearly with the number of subagents — 4 subagents with pre-scoped context save roughly 10,000 tokens compared to blind exploration.
5. Define Clear File Boundaries
When running parallel subagents that make code changes, explicitly assign non-overlapping file sets to each subagent. "Subagent A modifies `src/auth/*`, subagent B modifies `src/api/*`." Overlapping file assignments cause merge conflicts and wasted work.
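One way to catch overlapping assignments before spawning anything is a quick prefix check. This sketch treats each boundary as a directory prefix (real boundaries could also be glob patterns):

```python
from itertools import combinations

def check_boundaries(assignments: dict[str, str]) -> list[tuple[str, str]]:
    """Return pairs of subagents whose directory prefixes overlap.

    `assignments` maps subagent name -> directory prefix it may modify.
    Two prefixes overlap when one contains the other.
    """
    conflicts = []
    for (a, pa), (b, pb) in combinations(assignments.items(), 2):
        if pa.startswith(pb) or pb.startswith(pa):
            conflicts.append((a, b))
    return conflicts

print(check_boundaries({"A": "src/auth/", "B": "src/api/"}))  # []
print(check_boundaries({"A": "src/auth/", "B": "src/"}))      # [('A', 'B')]
```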
6. Verify Subagent Output
Subagents work in isolation, so they can't see the effects of other subagents' changes. After all subagents complete, run your test suite and review the combined changes for consistency. Integration issues typically manifest as import errors, type mismatches, or duplicated utility functions.
7. Use CLAUDE.md to Set Conventions
Every subagent reads your project's `CLAUDE.md` file. Put coding conventions, test patterns, and architectural rules there. This gives subagents project-specific context without burning tokens on inline instructions — and it stays consistent across all subagents.
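A conventions section in `CLAUDE.md` might look like this — the specific rules are illustrative, tailor them to your project:

```markdown
## Conventions

- Tests: vitest; co-locate in `__tests__/` next to the module under test
- Imports: absolute from `src/`; no deep relative paths
- Errors: throw typed errors from `src/errors.ts`; never throw strings
- Before finishing any task, run `npm test` for the touched package
```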
Advanced: Chaining Subagent Stages
For complex tasks, use a multi-stage subagent pipeline:
Stage 1 — Research (Explore subagents)
Spawn 2-3 Explore subagents to investigate different aspects of the task. One traces the data flow, another maps the dependency graph, a third audits existing tests.
Stage 2 — Plan (Plan subagent)
Feed the Explore results into a Plan subagent. It produces a structured implementation plan with specific file changes, ordered by dependency.
Stage 3 — Execute (General-purpose subagents)
Spawn execution subagents based on the plan. Each gets a clear, well-scoped task with all necessary context pre-loaded.
Stage 4 — Verify (Explore subagent)
A final Explore subagent reviews all changes for consistency, runs tests, and flags any integration issues.
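The four stages can be sketched as a pipeline where only the research and execution stages fan out in parallel. The stage functions below are stand-ins for real subagent calls:

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-ins for real subagent invocations.
def explore(aspect: str) -> str:
    return f"findings: {aspect}"

def plan(findings: list[str]) -> list[str]:
    # A Plan subagent turns combined findings into ordered, scoped tasks.
    return [f"task from {f}" for f in findings]

def execute(task: str) -> str:
    return f"completed {task}"

def verify(changes: list[str]) -> bool:
    # A final Explore subagent would run tests and flag integration issues.
    return all(c.startswith("completed") for c in changes)

# Stage 1: parallel research over independent aspects.
aspects = ["data flow", "dependency graph", "existing tests"]
with ThreadPoolExecutor() as pool:
    findings = list(pool.map(explore, aspects))

# Stage 2: one plan built from all findings.
tasks = plan(findings)

# Stage 3: parallel execution of the planned tasks.
with ThreadPoolExecutor() as pool:
    changes = list(pool.map(execute, tasks))

# Stage 4: verification over the combined result.
print(verify(changes))
```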
This pipeline takes more time to set up than ad-hoc subagent spawning, but it produces dramatically better results on complex tasks — refactors, feature additions, or cross-cutting concerns that touch 10+ files.
The Compounding Effect
Subagents don't just parallelize work — they change the economics of AI-assisted coding. A developer who masters subagent workflows can tackle tasks that would otherwise blow through a single context window. The key is keeping each subagent's context lean, focused, and pre-scoped.
The math is straightforward. Four subagents with pre-scoped context cost roughly the same as one agent doing everything sequentially — but they finish in 25% of the time. At scale, that's the difference between a 30-minute refactor and a 2-hour slog through context window limits.
Nicola
Developer and creator of vexp — a context engine for AI coding agents. I build tools that make AI coding assistants faster, cheaper, and actually useful on real codebases.