Claude Code Model Switching: Use Haiku for 80% of Tasks and Save Big

Nicola

Haiku costs $0.80 per million input tokens and $4 per million output tokens. Sonnet costs $3/$15. Opus costs $15/$75. That means Haiku is 3.75x cheaper than Sonnet and 18.75x cheaper than Opus per token.

On a typical coding task, Haiku costs about $0.04. The same task on Sonnet costs $0.17; on Opus, $0.83. If 80% of your daily tasks can run on Haiku, your Claude Code bill drops by more than half compared to Sonnet-only usage, and by over 90% compared to Opus-only usage. Add a context engine that trims token counts and the savings against Sonnet-only reach 70-80%.

The catch: Haiku is extremely sensitive to context quality. Give it vague context, and it produces vague output. Give it precise, graph-ranked context, and it handles the vast majority of everyday coding tasks as well as Sonnet. That prerequisite changes everything.

Haiku's Price Advantage in Real Numbers

Let's put concrete numbers on the three models across realistic usage levels.

Cost per task (average task: ~30K input + ~5K output tokens):

| Model | Input Cost | Output Cost | Total/Task |
|---|---|---|---|
| Haiku | $0.024 | $0.020 | $0.044 |
| Sonnet | $0.090 | $0.075 | $0.165 |
| Opus | $0.450 | $0.375 | $0.825 |
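These per-task figures follow directly from the per-token prices. A minimal sketch to reproduce them (prices are hard-coded from this article; adjust if Anthropic's pricing changes):

```python
# Per-million-token prices (input, output) as quoted in this article.
PRICES = {
    "haiku": (0.80, 4.00),
    "sonnet": (3.00, 15.00),
    "opus": (15.00, 75.00),
}

def task_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a single task for the given model."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Average task from the table: ~30K input + ~5K output tokens.
for model in PRICES:
    print(f"{model}: ${task_cost(model, 30_000, 5_000):.3f}")
# haiku: $0.044, sonnet: $0.165, opus: $0.825
```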

Daily cost (10 tasks/day, single-model usage):

| Model | Daily Cost | Monthly (20 days) | Annual |
|---|---|---|---|
| Haiku | $0.44 | $8.80 | $105.60 |
| Sonnet | $1.65 | $33.00 | $396.00 |
| Opus | $8.25 | $165.00 | $1,980.00 |

At $8.80/month for Haiku-only usage, you're spending less than half the cost of a Pro subscription — on the API plan. Even if you only shift 50% of tasks to Haiku, the savings are substantial.

What Haiku Handles Well

Haiku is a smaller, faster model. It excels at tasks where the pattern is clear, the scope is narrow, and the answer follows directly from the provided context. That describes a surprising amount of daily coding work.

Simple Bug Fixes

Null checks, off-by-one errors, missing return statements, incorrect variable references, typos in string literals. Bugs where the fix is obvious once you see the problem. Haiku spots these instantly and applies the correct fix — often faster than Sonnet or Opus because it doesn't over-analyze.

Boilerplate and Scaffolding

Creating a new React component from a template. Adding a new API route that follows the same pattern as existing routes. Writing a database migration that adds columns. Setting up a new test file with standard imports and describe blocks. These are pattern-replication tasks, and Haiku replicates patterns perfectly.

Formatting and Style Changes

Adding TypeScript types to untyped code. Converting `var` to `const`/`let`. Reorganizing imports. Applying consistent naming conventions. Converting callback-style code to async/await. Mechanical transformations where the "what to do" is completely specified.

Test Writing (Given Examples)

If your codebase has existing test patterns, Haiku writes new tests that follow those patterns extremely well. Give it a function and an example test from the same file, and it produces comprehensive test coverage that matches the project's testing style.

Documentation

JSDoc comments, inline comments, README sections, API endpoint documentation. Given the code and its purpose, Haiku writes clear, accurate documentation. It's fast, and the output quality is indistinguishable from Sonnet's for this task type.

Configuration Files

Docker configurations, CI/CD pipelines, ESLint rules, Prettier settings, package.json scripts, environment variable templates. These are structured, well-documented formats where Haiku's pattern-matching ability shines.

The 80% Rule

Add up simple fixes, boilerplate, formatting, tests, docs, and config. For most developers, this accounts for 70-80% of daily tasks. These are the tasks you do between the hard problems — the connective tissue of software development. They don't require deep reasoning, and they're perfectly suited for Haiku.

The Context Prerequisite

Here's where most developers go wrong with Haiku: they try it, get mediocre results, and conclude it's not good enough for coding. The problem isn't Haiku — it's the context.

Haiku is extremely sensitive to context quality. More so than Sonnet, and far more than Opus.

Without a context engine, Claude Code explores your codebase by reading files. Opus can read 5 files and infer the full picture. Sonnet needs 10-15 files to reach the same understanding. Haiku needs every relevant file explicitly in context — it doesn't infer well across gaps.

This means:

  • Haiku + blind exploration = poor results. It reads files but struggles to synthesize the information into a coherent picture. It misses dependencies, forgets type definitions from earlier reads, and produces code that doesn't integrate well.
  • Haiku + pre-computed context = excellent results. When a context engine provides the exact function signatures, type definitions, import chains, and dependency relationships, Haiku has everything it needs. No inference required. It produces output that matches Sonnet quality for routine tasks.

A dependency-graph context engine like vexp makes Haiku viable for production coding by eliminating the information gap. Instead of relying on Haiku's weaker inference capability, you give it complete, pre-ranked context. The model doesn't need to be smart about finding information — it just needs to be smart about using information it already has. And Haiku is very good at that.

The Three-Tier Model Strategy

The optimal cost structure uses all three models, each for its strength:

Tier 1: Haiku — 80% of Tasks

Use Haiku for everything in the "handles well" list above. Start every session with Haiku as the default. With a context engine providing pre-computed context, Haiku handles these tasks quickly and accurately.

Best for: Simple fixes, boilerplate, formatting, tests, docs, configs, any task where the pattern is clear and the scope is narrow.

Switch up when: Haiku's output is wrong after one attempt, or the task requires reasoning across multiple files.

Tier 2: Sonnet — 15% of Tasks

Upgrade to Sonnet for tasks that require moderate reasoning — feature implementation with some architectural decisions, bug fixes that span 3-5 files, refactors with cascading changes, or any task where Haiku produced incorrect output.

Best for: Feature implementation, multi-file bug fixes, moderate refactors, test suites for complex logic.

Switch up when: Sonnet's output is wrong or incomplete after one retry, or the task involves 5+ interdependent files with novel patterns.

Tier 3: Opus — 5% of Tasks

Reserve Opus for the genuinely hard problems. Complex multi-file refactors, subtle concurrency bugs, novel architectural patterns, large-scale code generation. These are the tasks where Opus's 5x price premium delivers 5x the value.

Best for: Complex refactors, subtle bugs, novel architecture, large code generation, problems where Sonnet failed.

Daily Cost Comparison

Here's what the three-tier strategy looks like compared to single-model approaches for a developer doing 10 tasks/day:

| Strategy | Daily Cost | Monthly (20 days) | Annual |
|---|---|---|---|
| Opus only | $8.25 | $165.00 | $1,980 |
| Sonnet only | $1.65 | $33.00 | $396 |
| Three-tier (no context engine) | $1.01 | $20.25 | $243 |
| Three-tier + context engine | $0.42 | $8.40 | $101 |

Three-tier breakdown (10 tasks/day):

  • 8 tasks on Haiku: 8 x $0.044 = $0.35
  • 1.5 tasks on Sonnet: 1.5 x $0.165 = $0.25
  • 0.5 tasks on Opus: 0.5 x $0.825 = $0.41
  • Daily total: $1.01

With context engine (58% token reduction across all models):

  • 8 tasks on Haiku: 8 x $0.0185 = $0.15
  • 1.5 tasks on Sonnet: 1.5 x $0.0693 = $0.10
  • 0.5 tasks on Opus: 0.5 x $0.3465 = $0.17
  • Daily total: ~$0.42

At $0.42/day, your annual Claude Code cost is about $101. That's roughly the price of a single month of the Max 5x plan.
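The blended arithmetic can be checked in a few lines. This sketch hard-codes the per-task costs and daily task mix from the tables above; note that rounding each bullet separately yields $0.42, while exact arithmetic lands about a cent higher.

```python
# Check the three-tier arithmetic, with and without the ~58% token
# reduction this article attributes to a context engine.
PER_TASK = {"haiku": 0.044, "sonnet": 0.165, "opus": 0.825}  # $ per task
MIX = {"haiku": 8, "sonnet": 1.5, "opus": 0.5}               # tasks per day

def daily_cost(token_reduction: float = 0.0) -> float:
    """Blended daily cost; token_reduction scales every task's cost down."""
    return sum(MIX[m] * PER_TASK[m] * (1 - token_reduction) for m in MIX)

print(f"no engine:   ${daily_cost():.2f}")      # $1.01
print(f"with engine: ${daily_cost(0.58):.2f}")  # about $0.42-0.43
```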

How to Implement Model Switching

Claude Code makes model switching straightforward with the `/model` command.

Setting Your Default

At the start of each session, Claude Code uses the model specified in your configuration. To default to Haiku, you can set it in your Claude Code settings or simply type `/model haiku` at the start of each session.
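As a sketch, you can pin the default in your user settings file rather than typing the command each session. The exact file location and accepted values may vary by Claude Code version; a `model` key in `~/.claude/settings.json` is the common setup:

```json
{
  "model": "haiku"
}
```

With this in place, every new session starts on Haiku, and `/model sonnet` or `/model opus` upgrades only when a task needs it.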

Switching Mid-Session

When you encounter a task that needs more capability:

  1. Type `/model sonnet` to upgrade to Sonnet
  2. Complete the task
  3. Type `/model haiku` to switch back

The switch is instant and applies to the next message. There's no session restart, no context loss.

Recognizing When to Upgrade

Develop a mental checklist:

  • Stay on Haiku if: The task is a single file, follows an existing pattern, or is mechanical
  • Switch to Sonnet if: The task spans 2-5 files, requires understanding business logic, or Haiku's first attempt was wrong
  • Switch to Opus if: The task spans 5+ files with complex dependencies, involves a bug you can't reproduce, or Sonnet failed
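The checklist above can be written down as a small routing heuristic. This is an illustrative sketch, not part of Claude Code; the function name and thresholds are assumptions to tune against your own workflow:

```python
# Toy router mirroring this article's rules of thumb: stay on Haiku for
# narrow pattern-following work, escalate on scope or after a failure.
def pick_model(files_touched: int, follows_pattern: bool,
               haiku_failed: bool = False, sonnet_failed: bool = False) -> str:
    if sonnet_failed or files_touched > 5:
        return "opus"    # 5+ interdependent files, or Sonnet already failed
    if haiku_failed or files_touched > 1:
        return "sonnet"  # multi-file reasoning, or Haiku's attempt was wrong
    if follows_pattern:
        return "haiku"   # single file, existing pattern: the 80% case
    return "sonnet"      # unclear pattern: don't gamble on Haiku

print(pick_model(1, True))                     # haiku
print(pick_model(3, False))                    # sonnet
print(pick_model(6, False))                    # opus
print(pick_model(1, True, haiku_failed=True))  # sonnet
```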

After a week of conscious switching, the pattern becomes automatic. You'll instinctively know which model fits each task.

Making Haiku Viable: The Context Engine Requirement

Without a context engine, using Haiku for 80% of tasks is aspirational. You'll find it fails on too many tasks, forcing frequent upgrades to Sonnet. The failure rate without good context is around 40-50% for anything beyond trivial changes.

With a context engine like vexp, Haiku's failure rate on routine tasks drops to 5-10%. The engine provides:

  • Exact function signatures — Haiku knows the types without reading files
  • Dependency chains — Haiku sees which modules connect without exploring
  • Related test patterns — Haiku follows existing test conventions without searching
  • Type definitions — Haiku gets full type context without importing

This transforms Haiku from a model that "works for trivial tasks" to one that "works for most tasks." The context engine compensates for Haiku's weaker inference capability by providing the information that Sonnet and Opus would have figured out through exploration.

The investment is minimal — vexp sets up in under 5 minutes as an MCP server and indexes most codebases in seconds. The return is a 70-80% reduction in Claude Code costs by making Haiku viable for the majority of your work.

The Bigger Picture

Model switching isn't about settling for less. It's about matching capability to requirement. Using Opus for a simple bug fix is like hiring a senior architect to change a light bulb — you get the same result at 18x the cost.

The three-tier strategy — Haiku 80%, Sonnet 15%, Opus 5% — captures 95%+ of the output quality at 5-10% of the all-Opus cost. Add a context engine to make Haiku reliable, and you're looking at annual Claude Code costs under $110.

That's not a compromise. It's an optimization.

Frequently Asked Questions

Can Claude Code Haiku really handle 80% of coding tasks?
Yes, but with an important caveat: Haiku needs high-quality context to perform well. With a dependency-graph context engine providing exact function signatures, type definitions, and dependency chains, Haiku handles simple fixes, boilerplate, formatting, tests, documentation, and configuration tasks — which collectively make up 70-80% of daily coding work. Without good context, Haiku's success rate on non-trivial tasks drops significantly.
How much cheaper is Haiku compared to Sonnet and Opus in Claude Code?
Haiku costs $0.80/$4 per million input/output tokens versus Sonnet's $3/$15 and Opus's $15/$75. Per typical coding task, Haiku costs about $0.044 compared to $0.165 for Sonnet and $0.825 for Opus. That makes Haiku 3.75x cheaper than Sonnet and 18.75x cheaper than Opus on a per-token basis.
How do I switch models in Claude Code?
Use the `/model` command followed by the model name. Type `/model haiku` to switch to Haiku, `/model sonnet` for Sonnet, or `/model opus` for Opus. The switch takes effect immediately for subsequent messages. You can switch multiple times within a single session without losing context.
What tasks should I NOT use Haiku for?
Avoid Haiku for complex multi-file refactors (5+ files with interdependencies), subtle bugs like race conditions or state machine errors, novel architectural patterns that don't follow existing codebase conventions, and large-scale code generation requiring structural coherence. These tasks benefit from Sonnet or Opus's stronger reasoning capabilities.
What is the optimal model split for minimizing Claude Code costs?
The optimal split is approximately 80% Haiku, 15% Sonnet, and 5% Opus. This delivers 95%+ of output quality at roughly $1/day without a context engine, or $0.42/day with one. The key enabler is a context engine that makes Haiku reliable for routine tasks by providing pre-computed, dependency-ranked context instead of requiring Haiku to explore and infer on its own.

Nicola

Developer and creator of vexp — a context engine for AI coding agents. I build tools that make AI coding assistants faster, cheaper, and actually useful on real codebases.
