Claude Code Model Switching: Use Haiku for 80% of Tasks and Save Big

Haiku costs $0.80 per million input tokens and $4 per million output tokens. Sonnet costs $3/$15. Opus costs $15/$75. That means Haiku is 3.75x cheaper than Sonnet and 18.75x cheaper than Opus per token.
On a typical coding task, Haiku costs about $0.04. The same task on Sonnet costs $0.17. On Opus, $0.83. If 80% of your daily tasks can run on Haiku, your Claude Code bill drops by 70-80% compared to Sonnet-only usage, and by over 90% compared to Opus-only usage.
The catch: Haiku is extremely sensitive to context quality. Give it vague context, and it produces vague output. Give it precise, graph-ranked context, and it handles the vast majority of everyday coding tasks as well as Sonnet. That prerequisite changes everything.
Haiku's Price Advantage in Real Numbers
Let's put concrete numbers on the three models across realistic usage levels.
Cost per task (average task: ~30K input + ~5K output tokens):
| Model | Input Cost | Output Cost | Total/Task |
|---|---|---|---|
| Haiku | $0.024 | $0.020 | $0.044 |
| Sonnet | $0.090 | $0.075 | $0.165 |
| Opus | $0.450 | $0.375 | $0.825 |
Daily cost (10 tasks/day, single-model usage):
| Model | Daily Cost | Monthly (20 days) | Annual |
|---|---|---|---|
| Haiku | $0.44 | $8.80 | $105.60 |
| Sonnet | $1.65 | $33.00 | $396.00 |
| Opus | $8.25 | $165.00 | $1,980.00 |
At $8.80/month for Haiku-only usage, you're spending less than half the cost of a Pro subscription — on the API plan. Even if you only shift 50% of tasks to Haiku, the savings are substantial.
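If you want to sanity-check these tables, the arithmetic is simple enough to script. A minimal sketch in TypeScript, assuming the per-million-token prices quoted above (verify against Anthropic's current pricing before relying on the output):
```typescript
// Per-task and daily cost from token counts and per-million-token prices.
// Prices are the rates quoted above; adjust if Anthropic's pricing changes.
type Pricing = { inputPerM: number; outputPerM: number };

const MODELS: Record<string, Pricing> = {
  haiku: { inputPerM: 0.8, outputPerM: 4 },
  sonnet: { inputPerM: 3, outputPerM: 15 },
  opus: { inputPerM: 15, outputPerM: 75 },
};

// The average task assumed above: ~30K input tokens, ~5K output tokens.
function taskCost(p: Pricing, inputTokens = 30_000, outputTokens = 5_000): number {
  return (inputTokens / 1e6) * p.inputPerM + (outputTokens / 1e6) * p.outputPerM;
}

for (const [name, p] of Object.entries(MODELS)) {
  const cost = taskCost(p);
  console.log(`${name}: $${cost.toFixed(3)}/task, $${(cost * 10).toFixed(2)}/day at 10 tasks`);
}
// haiku:  $0.044/task, $0.44/day
// sonnet: $0.165/task, $1.65/day
// opus:   $0.825/task, $8.25/day
```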
What Haiku Handles Well
Haiku is a smaller, faster model. It excels at tasks where the pattern is clear, the scope is narrow, and the answer follows directly from the provided context. That describes a surprising amount of daily coding work.
Simple Bug Fixes
Null checks, off-by-one errors, missing return statements, incorrect variable references, typos in string literals. Bugs where the fix is obvious once you see the problem. Haiku spots these instantly and applies the correct fix — often faster than Sonnet or Opus because it doesn't over-analyze.
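Here's a representative example of that class of bug, using a hypothetical `getDisplayName` helper. The defect and the fix both fit in one line, which is exactly the shape of problem Haiku excels at:
```typescript
// Hypothetical example: a missing null check hidden behind a non-null assertion.
interface User {
  profile?: { name: string };
}

// Before: compiles, but throws "Cannot read properties of undefined"
// whenever a user has no profile record.
const getDisplayName = (user: User) => user.profile!.name;

// After: the one-line fix Haiku reliably produces from the error alone.
const getDisplayNameFixed = (user: User) => user.profile?.name ?? "Anonymous";
```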
Boilerplate and Scaffolding
Creating a new React component from a template. Adding a new API route that follows the same pattern as existing routes. Writing a database migration that adds columns. Setting up a new test file with standard imports and describe blocks. These are pattern-replication tasks, and Haiku replicates patterns perfectly.
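For instance, given one existing component as a template, a request like "create a Card component matching our conventions" is pure pattern replication. A hypothetical sketch of the kind of output involved (names and class conventions are assumptions for illustration):
```tsx
// Hypothetical scaffolding output: a Card component cloned from an
// existing pattern in the codebase.
import React from "react";

interface CardProps {
  title: string;
  children: React.ReactNode;
}

export function Card({ title, children }: CardProps) {
  return (
    <section className="card">
      <h2 className="card__title">{title}</h2>
      <div className="card__body">{children}</div>
    </section>
  );
}
```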
Formatting and Style Changes
Adding TypeScript types to untyped code. Converting `var` to `const`/`let`. Reorganizing imports. Applying consistent naming conventions. Converting callback-style code to async/await. Mechanical transformations where the "what to do" is completely specified.
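A typical example of such a transformation: converting a hypothetical callback-style helper to async/await. Nothing here requires judgment; the target pattern fully determines the result:
```typescript
import { readFile } from "node:fs";
import { readFile as readFileAsync } from "node:fs/promises";

// Before: a hypothetical legacy helper in callback style.
function loadConfig(path: string, done: (err: Error | null, cfg?: object) => void): void {
  readFile(path, "utf8", (err, data) => {
    if (err) return done(err);
    done(null, JSON.parse(data));
  });
}

// After: the same behavior restated with async/await.
async function loadConfigAsync(path: string): Promise<object> {
  const data = await readFileAsync(path, "utf8");
  return JSON.parse(data);
}
```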
Test Writing (Given Examples)
If your codebase has existing test patterns, Haiku writes new tests that follow those patterns extremely well. Give it a function and an example test from the same file, and it produces comprehensive test coverage that matches the project's testing style.
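For example, shown one existing test, Haiku extends coverage in the same style. The `slugify` function and the Vitest-style tests below are hypothetical, but they illustrate the pattern-following at work:
```typescript
import { describe, it, expect } from "vitest";

// Hypothetical function under test.
function slugify(title: string): string {
  return title
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-|-$/g, "");
}

describe("slugify", () => {
  // The existing test the model is shown as a pattern:
  it("lowercases and hyphenates words", () => {
    expect(slugify("Hello World")).toBe("hello-world");
  });
  // The pattern-matched additions it produces:
  it("strips leading and trailing separators", () => {
    expect(slugify("  --Hello--  ")).toBe("hello");
  });
  it("collapses runs of punctuation", () => {
    expect(slugify("a!!b??c")).toBe("a-b-c");
  });
});
```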
Documentation
JSDoc comments, inline comments, README sections, API endpoint documentation. Given the code and its purpose, Haiku writes clear, accurate documentation. It's fast, and the output quality is indistinguishable from Sonnet's for this task type.
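A quick illustration with a hypothetical retry helper: given the implementation, producing the JSDoc block is a mechanical summarization task:
```typescript
/**
 * Retries an async operation with exponential backoff.
 *
 * @param fn - The operation to retry.
 * @param retries - Maximum number of attempts (default 3).
 * @param baseDelayMs - Delay before the first retry; doubles on each attempt.
 * @returns The resolved value of the first successful attempt.
 * @throws The last error if every attempt fails.
 */
async function withRetry<T>(fn: () => Promise<T>, retries = 3, baseDelayMs = 100): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < retries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
    }
  }
  throw lastError;
}
```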
Configuration Files
Docker configurations, CI/CD pipelines, ESLint rules, Prettier settings, package.json scripts, environment variable templates. These are structured, well-documented formats where Haiku's pattern-matching ability shines.
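For example, a minimal `vitest.config.ts` is exactly the kind of structured, well-documented format Haiku produces reliably (the paths and reporters here are assumptions about a conventional `src/` layout):
```typescript
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    environment: "node", // no DOM needed for these tests
    include: ["src/**/*.test.ts"],
    coverage: { provider: "v8", reporter: ["text", "lcov"] },
  },
});
```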
The 80% Rule
Add up simple fixes, boilerplate, formatting, tests, docs, and config. For most developers, this accounts for 70-80% of daily tasks. These are the tasks you do between the hard problems — the connective tissue of software development. They don't require deep reasoning, and they're perfectly suited for Haiku.
The Context Prerequisite
Here's where most developers go wrong with Haiku: they try it, get mediocre results, and conclude it's not good enough for coding. The problem isn't Haiku — it's the context.
Haiku is extremely sensitive to context quality. More so than Sonnet, and far more than Opus.
Without a context engine, Claude Code explores your codebase by reading files. Opus can read 5 files and infer the full picture. Sonnet needs 10-15 files to reach the same understanding. Haiku needs every relevant file explicitly in context — it doesn't infer well across gaps.
This means:
- Haiku + blind exploration = poor results. It reads files but struggles to synthesize the information into a coherent picture. It misses dependencies, forgets type definitions from earlier reads, and produces code that doesn't integrate well.
- Haiku + pre-computed context = excellent results. When a context engine provides the exact function signatures, type definitions, import chains, and dependency relationships, Haiku has everything it needs. No inference required. It produces output that matches Sonnet quality for routine tasks.
A dependency-graph context engine like vexp makes Haiku viable for production coding by eliminating the information gap. Instead of relying on Haiku's weaker inference capability, you give it complete, pre-ranked context. The model doesn't need to be smart about finding information — it just needs to be smart about using information it already has. And Haiku is very good at that.
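To make "pre-computed context" concrete, here is a purely illustrative sketch of the kind of payload a graph-based engine might hand the model. This is not vexp's actual format, just the shape of information involved:
```typescript
// Illustrative only: the shape of pre-computed context a dependency-graph
// engine might inject. Field names are hypothetical, not vexp's API.
interface PrecomputedContext {
  task: string;              // the user's request
  signatures: string[];      // exact signatures of every function in scope
  typeDefinitions: string[]; // full definitions of the types the task touches
  importChain: string[];     // modules connecting the files being edited
  relatedTests: string[];    // existing tests that establish conventions
}

const example: PrecomputedContext = {
  task: "Add a null check to getDisplayName",
  signatures: ["getDisplayName(user: User): string"],
  typeDefinitions: ["interface User { profile?: { name: string } }"],
  importChain: ["src/ui/header.ts -> src/lib/users.ts"],
  relatedTests: ["src/lib/users.test.ts"],
};
```
With everything spelled out like this, the model's job shrinks from "explore and infer" to "apply what's in front of you," which is the regime where Haiku performs well.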
The Three-Tier Model Strategy
The optimal cost structure uses all three models, each for its strength:
Tier 1: Haiku — 80% of Tasks
Use Haiku for everything in the "handles well" list above. Start every session with Haiku as the default. With a context engine providing pre-computed context, Haiku handles these tasks quickly and accurately.
Best for: Simple fixes, boilerplate, formatting, tests, docs, configs, any task where the pattern is clear and the scope is narrow.
Switch up when: Haiku's output is wrong after one attempt, or the task requires reasoning across multiple files.
Tier 2: Sonnet — 15% of Tasks
Upgrade to Sonnet for tasks that require moderate reasoning — feature implementation with some architectural decisions, bug fixes that span 3-5 files, refactors with cascading changes, or any task where Haiku produced incorrect output.
Best for: Feature implementation, multi-file bug fixes, moderate refactors, test suites for complex logic.
Switch up when: Sonnet's output is wrong or incomplete after one retry, or the task involves 5+ interdependent files with novel patterns.
Tier 3: Opus — 5% of Tasks
Reserve Opus for the genuinely hard problems. Complex multi-file refactors, subtle concurrency bugs, novel architectural patterns, large-scale code generation. These are the tasks where Opus's 5x price premium delivers 5x the value.
Best for: Complex refactors, subtle bugs, novel architecture, large code generation, problems where Sonnet failed.
Daily Cost Comparison
Here's what the three-tier strategy looks like compared to single-model approaches for a developer doing 10 tasks/day:
| Strategy | Daily Cost | Monthly (20 days) | Annual |
|---|---|---|---|
| Opus only | $8.25 | $165.00 | $1,980 |
| Sonnet only | $1.65 | $33.00 | $396 |
| Three-tier (no context engine) | $1.01 | $20.20 | $242 |
| Three-tier + context engine | $0.42 | $8.40 | $101 |
Three-tier breakdown (10 tasks/day):
- 8 tasks on Haiku: 8 x $0.044 = $0.35
- 1.5 tasks on Sonnet: 1.5 x $0.165 = $0.25
- 0.5 tasks on Opus: 0.5 x $0.825 = $0.41
- Daily total: $1.01
With context engine (58% token reduction across all models):
- 8 tasks on Haiku: 8 x $0.018 = $0.15
- 1.5 tasks on Sonnet: 1.5 x $0.069 = $0.10
- 0.5 tasks on Opus: 0.5 x $0.347 = $0.17
- Daily total: $0.42
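The same arithmetic as a sketch, using the rounded per-task costs from the first table. The context engine's effect is modeled as a simple multiplier, since cost scales linearly with tokens:
```typescript
// Blended daily cost for the three-tier split, with the context engine
// modeled as a linear token-cost multiplier (58% reduction -> factor 0.42).
const PER_TASK = { haiku: 0.044, sonnet: 0.165, opus: 0.825 }; // from the first table
const TASKS_PER_DAY = { haiku: 8, sonnet: 1.5, opus: 0.5 };

function dailyCost(tokenFactor: number): number {
  return (Object.keys(PER_TASK) as (keyof typeof PER_TASK)[]).reduce(
    (sum, model) => sum + PER_TASK[model] * TASKS_PER_DAY[model] * tokenFactor,
    0,
  );
}

console.log(dailyCost(1).toFixed(3));    // "1.012" (the $1.01/day above)
console.log(dailyCost(0.42).toFixed(3)); // "0.425" (~$0.42/day with the engine)
```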
At $0.42/day, your annual Claude Code cost is $100.80. That's roughly what a single month of Max 5x costs.
How to Implement Model Switching
Claude Code makes model switching straightforward with the `/model` command.
Setting Your Default
At the start of each session, Claude Code uses the model specified in your configuration. To default to Haiku, you can set it in your Claude Code settings or simply type `/model haiku` at the start of each session.
Switching Mid-Session
When you encounter a task that needs more capability:
- Type `/model sonnet` to upgrade to Sonnet
- Complete the task
- Type `/model haiku` to switch back
The switch is instant and applies to the next message. There's no session restart, no context loss.
Recognizing When to Upgrade
Develop a mental checklist:
- Stay on Haiku if: The task is a single file, follows an existing pattern, or is mechanical
- Switch to Sonnet if: The task spans 2-5 files, requires understanding business logic, or Haiku's first attempt was wrong
- Switch to Opus if: The task spans 5+ files with complex dependencies, involves a bug you can't reproduce, or Sonnet failed
After a week of conscious switching, the pattern becomes automatic. You'll instinctively know which model fits each task.
Making Haiku Viable: The Context Engine Requirement
Without a context engine, using Haiku for 80% of tasks is aspirational. You'll find it fails on too many tasks, forcing frequent upgrades to Sonnet. The failure rate without good context is around 40-50% for anything beyond trivial changes.
With a context engine like vexp, Haiku's failure rate on routine tasks drops to 5-10%. The engine provides:
- Exact function signatures — Haiku knows the types without reading files
- Dependency chains — Haiku sees which modules connect without exploring
- Related test patterns — Haiku follows existing test conventions without searching
- Type definitions — Haiku gets full type context without importing
This transforms Haiku from a model that "works for trivial tasks" to one that "works for most tasks." The context engine compensates for Haiku's weaker inference capability by providing the information that Sonnet and Opus would have figured out through exploration.
The investment is minimal — vexp sets up in under 5 minutes as an MCP server and indexes most codebases in seconds. The return is a 70-80% reduction in Claude Code costs by making Haiku viable for the majority of your work.
The Bigger Picture
Model switching isn't about settling for less. It's about matching capability to requirement. Using Opus for a simple bug fix is like hiring a senior architect to change a light bulb — you get the same result at 18x the cost.
The three-tier strategy — Haiku 80%, Sonnet 15%, Opus 5% — captures 95%+ of the output quality at 5-10% of the all-Opus cost. Add a context engine to make Haiku reliable, and you're looking at annual Claude Code costs under $110.
That's not a compromise. It's an optimization.
Frequently Asked Questions
Can Claude Code Haiku really handle 80% of coding tasks?
Yes, provided it gets precise context. Simple fixes, boilerplate, formatting, tests, docs, and configs make up 70-80% of most developers' daily work, and Haiku handles these at Sonnet-level quality when a context engine supplies complete, pre-ranked context.
How much cheaper is Haiku compared to Sonnet and Opus in Claude Code?
Haiku costs $0.80/$4 per million input/output tokens versus $3/$15 for Sonnet and $15/$75 for Opus, making it 3.75x cheaper than Sonnet and 18.75x cheaper than Opus per token.
How do I switch models in Claude Code?
Type `/model haiku`, `/model sonnet`, or `/model opus` at any point in a session. The switch is instant, applies to the next message, and loses no context.
What tasks should I NOT use Haiku for?
Complex multi-file refactors, subtle concurrency bugs, novel architectural patterns, large-scale code generation, and anything that requires reasoning across 5+ interdependent files. Use Sonnet for moderate reasoning and Opus for the genuinely hard problems.
What is the optimal model split for minimizing Claude Code costs?
Roughly 80% Haiku, 15% Sonnet, 5% Opus. Combined with a context engine, this brings a 10-task/day workload to about $0.42/day, or under $110/year.
Nicola
Developer and creator of vexp — a context engine for AI coding agents. I build tools that make AI coding assistants faster, cheaper, and actually useful on real codebases.
Related Articles

Vibe Coding Is Fun Until the Bill Arrives: Token Optimization Guide
Vibe coding with AI is addictive but expensive. Freestyle prompting without context management burns tokens 3-5x faster than structured workflows.

Windsurf Credits Running Out? How to Use Fewer Tokens Per Task
Windsurf credits deplete fast because the AI processes too much irrelevant context. Reduce what it needs to read and your credits last 2-3x longer.

Best AI Coding Tool for Startups: Balancing Cost, Speed, and Quality
Startups need speed and budget control. The ideal AI coding stack combines a free/cheap agent with context optimization — here's how to set it up.