Local-first context engine for AI coding agents

Your codebase.
Understood.

vexp builds a dependency graph of your codebase and serves only what matters to your AI agent — running entirely on your machine. No cloud. No account. No code leaving your laptop.

Works with
Claude Code
Cursor
Windsurf
Copilot
Continue.dev
Zed
Augment
Codex
Opencode
Kilo Code
Kiro
Antigravity
The problem

AI agents read everything. They understand nothing.

Every coding session starts the same: the agent scans files, guesses dependencies, and fills your context window with code it will never use. Cloud-based context engines solve this — but send your code to their servers.

Without vexp
8,247 tokens
import { authenticate } from "./auth"
import { RateLimiter } from "./limiter"
import { db } from "./database"
import { logger } from "./logger"
import { config } from "./config"
import { User } from "./types/user"
import { Session } from "./types/session"
import { CacheService } from "./cache"

export async function middleware(req, res, next) {
  const token = req.headers.authorization
  if (!token) return res.status(401).json({ error: "Unauthorized" })
  try {
    const user = await authenticate(token)
    req.user = user
    logger.info(`Request from ${user.id}`)
    next()
  } catch (err) {
    logger.error("Auth failed:", err)
    res.status(401).json({ error: "Invalid token" })
  }
}

export class RateLimiter {
  private redis: Redis
  private limits: Map<string, number>

  constructor(config: RateLimitConfig) {
    this.redis = new Redis(config.redisUrl)
    this.limits = new Map(Object.entries(config.limits))
  }

  async check(key: string): Promise<boolean> {
    const count = await this.redis.incr(key)
    const limit = this.limits.get(key) ?? 100
    return count <= limit
  }
}
context budget: 81% used
With vexp
2,140 tokens
● pivot  middleware/auth.ts
  export async function authenticate(token: string): Promise<User>
  export async function middleware(req, res, next): Promise<void>
○ skeleton  services/cache.ts
  class RateLimiter
    check(key: string): Promise<boolean>
    reset(key: string): Promise<void>
○ skeleton  config/limits.ts
  export const rateLimits: Record<string, number>
context budget: 21% used

That 74% token reduction is not an optimization. It is a structural repair — and it happens entirely on your machine.

How it works

Graph-native context. Deterministic. Local. Zero network calls.

01

Index

Parse. Graph. Persist.

vexp uses tree-sitter to parse your codebase into an AST, then builds a dependency graph: nodes for functions, classes, and types; edges for calls, imports, and implementations. Stored in local SQLite — never uploaded, never shared. 5,000 files indexed in under 15 seconds.

.vexp/index.db — 34.8k nodes | 89.2k edges
< 15s full index
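The shape of that graph can be sketched in a few lines of TypeScript. This is a hypothetical, simplified model: the real indexer parses with tree-sitter and persists via petgraph and SQLite, and the node and edge types below are illustrative, not vexp's actual schema.

```typescript
// Simplified model of a code dependency graph: nodes for functions,
// classes, and types; edges for calls, imports, and implementations.
type NodeKind = "function" | "class" | "type";
type EdgeKind = "calls" | "imports" | "implements";

interface GraphNode { id: string; kind: NodeKind; file: string }
interface GraphEdge { from: string; to: string; kind: EdgeKind }

class DepGraph {
  nodes = new Map<string, GraphNode>();
  edges: GraphEdge[] = [];

  addNode(n: GraphNode) { this.nodes.set(n.id, n); }
  addEdge(e: GraphEdge) { this.edges.push(e); }

  // All nodes directly reachable from `id` (one hop outward).
  neighbors(id: string): string[] {
    return this.edges.filter(e => e.from === id).map(e => e.to);
  }
}

const g = new DepGraph();
g.addNode({ id: "middleware", kind: "function", file: "middleware/auth.ts" });
g.addNode({ id: "authenticate", kind: "function", file: "middleware/auth.ts" });
g.addNode({ id: "RateLimiter", kind: "class", file: "services/cache.ts" });
g.addEdge({ from: "middleware", to: "authenticate", kind: "calls" });
g.addEdge({ from: "middleware", to: "RateLimiter", kind: "imports" });

console.log(g.neighbors("middleware")); // → ["authenticate", "RateLimiter"]
```

One-hop traversal like `neighbors` is what lets the engine pull in exactly the code a pivot touches and nothing else.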
02

Traverse

Hybrid search + graph centrality.

When an agent sends a task query, vexp runs hybrid search — FTS5 full-text matching combined with TF-IDF similarity scoring — to find candidate pivot nodes, then ranks them by graph centrality. Intent detection auto-selects the search strategy: "fix bug" triggers debug mode, "refactor" triggers blast-radius mode. No embeddings. No external API. No hallucination.

FTS5 + TF-IDF → 423 candidates → intent: debug → centrality rank → top 12 pivots
< 500ms P95 query
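The blend of text relevance and graph centrality can be illustrated with toy scores. Everything here is a simplification: the real engine uses SQLite FTS5 and TF-IDF weighting, and the 0.7/0.3 blend weights and degree-based centrality are invented for the example.

```typescript
// Hypothetical hybrid ranking: a keyword-match score blended with
// degree centrality (how connected a node is in the dependency graph).
interface Candidate { id: string; text: string; degree: number }

function keywordScore(query: string, text: string): number {
  const terms = query.toLowerCase().split(/\s+/);
  const words = text.toLowerCase().split(/\s+/);
  const hits = terms.filter(t => words.includes(t)).length;
  return hits / terms.length; // fraction of query terms matched
}

function rank(query: string, cands: Candidate[], maxDegree: number): Candidate[] {
  const score = (c: Candidate) =>
    0.7 * keywordScore(query, c.text) + 0.3 * (c.degree / maxDegree);
  return [...cands].sort((a, b) => score(b) - score(a));
}

const cands: Candidate[] = [
  { id: "authenticate", text: "authenticate token auth middleware", degree: 14 },
  { id: "logger", text: "log info error", degree: 3 },
];
const ranked = rank("auth middleware", cands, 14);
console.log(ranked[0].id); // → "authenticate"
```

Because centrality comes from the local graph rather than an embedding model, the same query against the same index always ranks the same way.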
03

Capsule

Pivots in full. Scaffolding skeletonized.

Pivot nodes are returned with full source. Adjacent nodes are reduced to signatures, docstrings, and return types — no implementation bodies. The capsule is bounded to your token budget. Exact context, nothing more.

pivot: 350 lines → skeleton: 8 lines (97.7% reduction)
70-90% skeleton reduction
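Skeletonization can be approximated with a brace-depth scan that keeps top-level signatures and drops bodies. The real skeletonizer works on the tree-sitter AST; this character-level sketch is illustrative only and would mishandle braces inside string literals.

```typescript
// Hypothetical sketch: replace each top-level `{ ... }` body with `;`,
// keeping only the signature. The real skeletonizer walks the AST.
function skeletonize(source: string): string {
  let out = "";
  let depth = 0;
  for (const ch of source) {
    if (ch === "{") {
      if (depth === 0) out += ";"; // body starts: emit terminator instead
      depth++;
    } else if (ch === "}") {
      depth--;
    } else if (depth === 0) {
      out += ch; // only signature-level text survives
    }
  }
  return out.replace(/\s+;/g, ";").trim();
}

const src = `export async function authenticate(token: string): Promise<User> {
  const payload = verify(token)
  return db.users.find(payload.sub)
}`;

console.log(skeletonize(src));
// → export async function authenticate(token: string): Promise<User>;
```

The agent still sees what `authenticate` accepts and returns; the implementation tokens are simply never spent.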
Architecture
AI Agents
Claude Code
Cursor
Windsurf
Copilot
Continue.dev
+ 7 more
MCP Protocol (stdio / HTTP)
vexp-mcp
TypeScript · Node.js
7 MCP tools
Tool schema validation
Session multiplexing
Bundled in VSIX
Unix socket / named pipe
vexp-core
Rust daemon
Indexer (tree-sitter)
Graph engine (petgraph)
Skeletonizer (AST)
SQLite persistence
File watcher (notify)
.vexp/index.db
~/.vexp/registry.db
Smart Features
v1.1.0
Intent Detection

Auto-detects query intent from your prompt. "fix bug" activates debug mode (follows error paths), "refactor" triggers blast-radius mode, "add feature" uses modify mode.
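A keyword-triggered classifier along these lines is enough to show the idea. This is a hypothetical sketch: the mode names mirror the docs above, but the trigger words and regexes are invented for illustration.

```typescript
// Hypothetical intent detection: map trigger words in the prompt to a
// search mode. "debug" is checked first so "fix bug in X" wins over "add".
type Mode = "debug" | "blast-radius" | "modify" | "default";

function detectIntent(query: string): Mode {
  const q = query.toLowerCase();
  if (/\b(fix|bug|error|crash)\b/.test(q)) return "debug";
  if (/\brefactor\b/.test(q)) return "blast-radius";
  if (/\b(add|implement|feature)\b/.test(q)) return "modify";
  return "default";
}

console.log(detectIntent("fix bug in rate limiter"));  // → "debug"
console.log(detectIntent("refactor the auth module")); // → "blast-radius"
```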

Hybrid Search

Combines FTS5 keyword matching with TF-IDF similarity and graph centrality. Finds validateCredentials when you search for "authentication" — no embeddings required.

LSP Bridge

VS Code captures type-resolved call edges from the language server for high-confidence call graphs. Supplements tree-sitter static analysis with runtime type information.

Feedback Loop

Repeated queries with similar terms automatically expand the result budget. The engine learns your session focus and returns progressively broader context.
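The budget-expansion loop can be sketched as session state that grows when consecutive queries share terms. The class name, base budget, cap, and doubling factor below are all illustrative assumptions, not vexp's actual parameters.

```typescript
// Hypothetical feedback loop: queries that share terms with the previous
// query get a progressively larger pivot budget; a topic change resets it.
class SessionBudget {
  private lastTerms = new Set<string>();
  private budget = 8; // assumed base number of pivots returned

  next(query: string): number {
    const terms = new Set(query.toLowerCase().split(/\s+/));
    const overlap = [...terms].some(t => this.lastTerms.has(t));
    this.budget = overlap ? Math.min(this.budget * 2, 32) : 8;
    this.lastTerms = terms;
    return this.budget;
  }
}

const s = new SessionBudget();
console.log(s.next("rate limiting"));       // → 8
console.log(s.next("rate limiter config")); // → 16 (shares "rate")
```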

Interactive demo

See it work.

Three scenarios. Real MCP tool calls. Token savings calculated.

MCP Tool Call
// Claude Code → vexp
tool: "get_context_capsule"
query: "Add rate limiting to auth middleware"
max_tokens: 8000
vexp Output
Run a query to see the output
MCP Tools

Seven tools. Every context problem solved.

Multi-repo & Git

Cross-repo context. Git-native index.

Most AI agents are repo-blind. Cloud context engines can link repos — but require uploading your code. vexp builds cross-repo dependency graphs entirely on your machine, spanning frontend, backend, and infra in a single local query.

1 query
spans all repos
auto-detected
API contracts & shared types
zero config
cross-repo linking
Cross-repo graph
frontend (TypeScript) · consumer
backend (Go) · provider
infra (Terraform) · infrastructure
Edges: OpenAPI contract · shared types · env vars
Single query across all repos:
get_context_capsule("Add JWT refresh to auth")
→ 3 repos · 8 pivots · 2,840 tokens
Cross-repo advantage
You ask: "Add JWT refresh to the auth endpoint"
vexp returns: auth handler (backend) + token type (shared) + refresh logic stub (frontend) — one query, three repos
3 repos · 1 query
Agent reads backend auth.go blindly, misses shared Token type
vexp detects the OpenAPI contract edge, includes the TypeScript Token type skeleton automatically
Zero missed dependencies
"What environment variables does the frontend expect from infra?"
vexp traces ENV_CONTRACT edges: infra .env.example → backend config → frontend process.env refs
Full env lineage
Git-native index
$ git clone git@github.com:org/repo.git
Receiving objects: 100% (12,847 objects)
vexp: 34,821 nodes ready (synced · commit a3f8c91)
4.8s total — vs 62s fresh index
Compatibility

Already in your workflow.

Install the VS Code extension. vexp auto-detects your agents, writes their config files, and starts working. No CLI. No login. No API key.

Claude Code
Cursor
Windsurf
GitHub Copilot
Continue.dev
Augment
Zed
Codex
Opencode
Kilo Code
Kiro
Antigravity

Distributed as a VS Code extension · works with any MCP agent · generates configs for 12 agents automatically

Performance

The numbers.

65-70%
Average token reduction
Measured across 5 real codebases
< 15s
Index 5,000 files from scratch
Parallel tree-sitter parsing with rayon
< 500ms
P95 context capsule query
Hybrid search: FTS + TF-IDF + graph centrality
0
Network calls to external servers
Native Rust binary. Local SQLite. Your code never leaves your machine.

Benchmarked against Next.js, FastAPI, Gin, and vexp itself

Pricing

Flat pricing. No credits. No surprises.

Starter lets you try vexp on any project — no account, no API key. Pro unlocks all tools, multi-repo, and 50,000 nodes for $19/mo.

Starter
$0forever

Try vexp on a personal project. No account required.

  • ≤ 2,000 nodes
  • 1 repository
  • 3 core MCP tools
  • Git index persistence
  • Community support
Most popular
Pro
$19/month

Full context engine for professional developers. Under $20 — expense it without approval.

  • 50,000 nodes
  • 3 repositories
  • All 7 MCP tools
  • Multi-repo workspace
  • Intent detection & CodeLens
  • Email support
Get started

Install. Open. Done.

01

Install the extension

VS Code / Cursor / Windsurf
# Search in the Extensions panel
Extensions → Search "vexp" → Install
✓ Auto-detects your AI agent and configures MCP
02

Open your workspace — vexp indexes automatically

Status Bar
vexp: indexing... 2,341/5,000 files
vexp: 34.8k nodes | 89.2k edges | ready
03

Your agent has context — no account needed

Claude Code
# Auto-called before every code edit
tool_call: get_context_capsule
query: "{task}"
✓ 12 pivots · 8 skeletons · 2,140 tokens

Context that never leaves
your machine.

Native Rust binary. Local SQLite. Zero network calls. Works with 12 agents out of the box.