Using Claude Code with FastAPI: Benchmark-Proven Token Optimization

FastAPI is an ideal stress test for AI coding assistants. It combines rich type annotations, Pydantic models, dependency injection, async I/O, and ORM integration: enough structure to be machine-readable, but enough complexity that naive context loading quickly becomes expensive.
We benchmarked Claude Code on a real-world FastAPI e‑commerce backend and measured how a graph-aware context engine (vexp) changes both cost and quality. This post walks through the benchmark, the results, and how to apply the same setup to your own FastAPI projects.
FastAPI + vexp Benchmark Summary
The benchmark shows that dependency-graph–driven context selection (via vexp) materially improves AI coding performance on a realistic, typed Python codebase.
Key Setup
- Codebase: FastAPI framework + representative app layer
- Tasks: 7 realistic dev tasks (bugs, features, refactors, docs/tests)
- Runs: 21 per arm (baseline vs. vexp)
- Model: Claude 3.5 Sonnet (API)
- Agent: Claude Code with MCP
Tasks Covered
- Fix validation bug in request body handling
- Add rate limiting middleware to an existing route
- Refactor dependency injection in the auth module
- Add a new endpoint with proper error handling
- Update tests for a modified response schema
- Add OpenAPI docs to undocumented endpoints
- Diagnose a DB-layer performance issue
Each task required coherent edits across ~5–15 files.
Aggregate Results
| Metric | Baseline | With vexp | Change |
|--------|----------|-----------|--------|
| Input tokens (avg) | 84,200 | 29,500 | -65% |
| Total API cost (avg) | $0.38 | $0.16 | -58% |
| Task completion time | 4.2 min | 3.3 min | -22% |
| Task completion rate | 71% | 85% | +14pp |
All 7 tasks and all categories improved on every metric; per-task token reduction ranged ~45–70%.
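The Change column can be re-derived from the averages in the table. The snippet below is a minimal sketch of that arithmetic; the helper name `percent_change` is ours and is not part of the benchmark tooling. Note that completion rate is compared in percentage points, not as a relative change.

```python
def percent_change(baseline: float, treated: float) -> float:
    """Relative change from baseline to treated, in percent."""
    return (treated - baseline) / baseline * 100.0


# Averages copied from the results table.
tokens = percent_change(84_200, 29_500)
cost = percent_change(0.38, 0.16)

# Completion rate is reported as a percentage-point difference.
completion_pp = 85 - 71

print(f"input tokens: {tokens:.0f}%")
print(f"API cost: {cost:.0f}%")
print(f"completion rate: +{completion_pp}pp")
```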
Why FastAPI Is a Strong Stress Test
FastAPI exposes weaknesses in naive context selection:
- Dependency injection chains (`Depends()`):
  - Long chains: `get_current_user` → `get_token` → `get_db` → db session config.
  - Naive agents either miss links or over-include unrelated dependencies.
- Decorator-based routing:
  - Routes via `@router.get(...)` etc. aren’t discovered by simple import following.
  - Agents that only track imports miss router–handler connections.
- Pydantic model proliferation:
  - Distinct types like `UserResponse`, `UserCreate`, `UserUpdate`.
  - Conflation or omission leads to incorrect code.
- Test/source separation:
  - Tests in a separate tree.
  - Naive context grabbers often pull in tests instead of focusing on source + models + deps.
These patterns create token-wasting traps that the benchmark makes visible.
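To illustrate why import following alone misses router–handler and `Depends()` links, here is a minimal sketch that recovers them from source text with Python's standard `ast` module. It is not how vexp works internally. The inlined module, the handler name `read_me`, and the helper names are hypothetical, and the sketch only handles the `param=Depends(fn)` default-argument style, not `Annotated[...]` dependencies.

```python
import ast

# Hypothetical router module, inlined as a string so the sketch is self-contained.
SOURCE = '''
from fastapi import APIRouter, Depends
from dependencies.auth import get_current_user
from dependencies.database import get_db

router = APIRouter()

@router.get("/users/me")
async def read_me(user=Depends(get_current_user), db=Depends(get_db)):
    return user

def helper():
    return 1
'''

HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def route_decorator(node):
    """Return (METHOD, path) if the decorator looks like @<name>.get("/path"), else None."""
    if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)):
        return None
    if node.func.attr not in HTTP_METHODS:
        return None
    path = None
    if node.args and isinstance(node.args[0], ast.Constant):
        path = node.args[0].value
    return node.func.attr.upper(), path


def depends_targets(func):
    """Collect names passed to Depends(...) in the function's parameter defaults."""
    defaults = list(func.args.defaults)
    defaults += [d for d in func.args.kw_defaults if d is not None]
    targets = []
    for default in defaults:
        if (isinstance(default, ast.Call)
                and isinstance(default.func, ast.Name)
                and default.func.id == "Depends"
                and default.args
                and isinstance(default.args[0], ast.Name)):
            targets.append(default.args[0].id)
    return targets


def find_routes(source):
    """Map each route handler to its HTTP method, path, and Depends() targets."""
    routes = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            for decorator in node.decorator_list:
                match = route_decorator(decorator)
                if match:
                    method, path = match
                    routes[node.name] = {
                        "method": method,
                        "path": path,
                        "depends": depends_targets(node),
                    }
    return routes


routes = find_routes(SOURCE)
print(routes)
```

The undecorated `helper` function is ignored, while `read_me` is linked to both of its dependencies even though nothing in the import statements says which handler uses which one.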
The Context Problem (Task 3 Example)
Task: Refactor dependency injection in auth.
Naive flow
- Start at `routers/users.py`.
- Follow `Depends(get_db)`, `Depends(get_current_user)`.
- Read `dependencies/database.py`, `dependencies/auth.py`.
- From `auth.py`, follow to `security.py`, `models/user.py`, `schemas/user.py`.
- From schemas, follow to `models/base.py`.
- Read adjacent `/routers/` files.
- Read `/tests/routers/` because they’re nearby.
Outcome: ~40+ files, ~84k tokens, when ~8 files are truly needed.
With vexp dependency graph
- `run_pipeline` seeds graph at `routers/users.py`.
- Traverses imports, relevant dependencies, related schemas.
- Ranks by centrality: `UserService`, `UserSchema`, `DatabaseSession` high; tests low.
- Returns a capsule of ~8 files (~29k tokens).
Same task, same model, ~65% fewer tokens and higher completion rate.
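The seed-traverse-rank idea can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not vexp's actual algorithm. The edge list is hypothetical (loosely based on the paths named in the walkthrough, with `routers/orders.py` and `tests/routers/test_users.py` invented as distractors), in-degree stands in for whatever centrality measure the real engine uses, and `capsule` is our own function name.

```python
from collections import deque

# Hypothetical dependency edges: file -> files it depends on.
GRAPH = {
    "routers/users.py": ["dependencies/auth.py", "dependencies/database.py", "schemas/user.py"],
    "dependencies/auth.py": ["security.py", "models/user.py", "dependencies/database.py"],
    "dependencies/database.py": [],
    "schemas/user.py": ["models/base.py"],
    "models/user.py": ["models/base.py"],
    "models/base.py": [],
    "security.py": [],
    "routers/orders.py": ["dependencies/database.py"],
    "tests/routers/test_users.py": ["routers/users.py"],
}


def reachable(graph, seed):
    """Breadth-first walk along dependency edges starting from the seed file."""
    seen, queue = {seed}, deque([seed])
    while queue:
        for dep in graph.get(queue.popleft(), []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


def capsule(graph, seed, limit=8):
    """Rank reachable files by in-degree within the reachable set; the seed comes first."""
    files = reachable(graph, seed)
    in_degree = {f: 0 for f in files}
    for src in files:
        for dep in graph.get(src, []):
            in_degree[dep] += 1
    # Higher in-degree first; ties broken alphabetically for a stable order.
    ranked = sorted(files - {seed}, key=lambda f: (-in_degree[f], f))
    return [seed] + ranked[: limit - 1]


selected = capsule(GRAPH, "routers/users.py")
print(selected)
```

Because the walk follows dependency edges outward from the seed, the sibling router and the test file never enter the candidate set: they depend on the seed's neighborhood, but nothing the seed needs depends on them. A directory-proximity heuristic, by contrast, would have pulled both in.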
How to Reproduce This in Your FastAPI Project
1. Install vexp CLI
Nicola
Developer and creator of vexp — a context engine for AI coding agents. I build tools that make AI coding assistants faster, cheaper, and actually useful on real codebases.