Claude Code Has No Session Memory — Here's How to Add It

Start a Claude Code session. Write code, make decisions, learn things about the codebase. End the session.
Start a new session the next day. Claude Code has no memory of what happened. Zero.
It will re-read the same files. Re-discover the same patterns. Re-learn the same conventions. You'll pay tokens for all of it.
This is a known limitation — and there's a practical fix.
What "No Session Memory" Actually Means
Claude Code is stateless between sessions. Each conversation is entirely fresh. This is by design: it's how LLM-based tools work. The model doesn't persist knowledge between API calls.
Within a single session, Claude Code does remember — that's just the conversation context. But once you close the session, that context is gone.
This matters more than it might seem. In a typical development day:
- You discover that the `UserService` class has a subtle threading issue
- You work around it, make a note, and continue
- Tomorrow, you open a new session and Claude Code reads `UserService` fresh. It doesn't know about the threading issue. It might suggest the same problematic pattern again.
- You spend tokens re-explaining something you already explained yesterday.
Multiply this by all the architectural decisions, conventions, and discovered constraints in a production codebase. The cognitive overhead is real.
The CLAUDE.md Partial Solution
The simplest approach is the CLAUDE.md file. Claude Code reads this file at the start of every session, giving you a way to inject persistent context.
Useful things to put in CLAUDE.md:
```markdown
## Architecture Notes
- Auth uses custom JWT (not PyJWT) — see auth/jwt_custom.py
- Inventory updates use optimistic locking due to race condition history
- UserService has a threading issue with the connection pool — never call .refresh() in a background task
## Conventions
- All new endpoints must include rate limiting via @rate_limit decorator
- Error responses use the format in api/errors.py, never return raw exceptions
- Tests live in tests/unit/ for unit tests, tests/integration/ for integration
```
This works, but it has limits:
- It's static — you have to manually update it
- It's global — all context loads every session regardless of relevance
- It doesn't scale — once it gets long, it adds tokens to every session whether needed or not
- It doesn't connect context to specific code symbols, so it can become stale silently
For small teams and simple projects, CLAUDE.md is often enough. For larger teams or complex codebases, you need something more dynamic.
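A rough back-of-envelope calculation makes the scaling limit concrete. The numbers below are hypothetical, chosen purely for illustration: a static CLAUDE.md charges its full token count to every session, while selective retrieval only charges the observations relevant to the task at hand.

```ts
// Illustrative cost comparison with made-up numbers (not measurements).

const claudeMdTokens = 3_000;         // hypothetical size of a mature CLAUDE.md
const sessionsPerMonth = 60;
const relevantTokensPerSession = 250; // hypothetical selective-memory load

// Static file: full cost on every session, relevant or not.
const staticCost = claudeMdTokens * sessionsPerMonth;

// Selective memory: only task-relevant observations are loaded.
const selectiveCost = relevantTokensPerSession * sessionsPerMonth;

console.log({ staticCost, selectiveCost }); // { staticCost: 180000, selectiveCost: 15000 }
```

The exact figures will vary by project; the point is that the static approach grows linearly with the file's size, regardless of relevance.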
The Better Approach: Graph-Linked Session Memory
The second approach is a dedicated session memory system. This is what vexp implements.
The key difference from a static CLAUDE.md: observations are linked to code symbols. When vexp's run_pipeline returns context for a task, it queries the memory store and surfaces only the observations that are relevant to the specific code being worked on.
How it works in practice:
- You save an observation during a session: "The UserService connection pool has a threading issue — never call .refresh() in background tasks"
- vexp links this observation to the `UserService` class symbol in its code graph
- Next session, when you work on anything touching `UserService`, this observation automatically surfaces in the context capsule
- The observation doesn't appear when you're working on unrelated code — it only loads when relevant
This is fundamentally different from a static CLAUDE.md because the observations are selective: you only pay tokens for the memory that's relevant to the current task.
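The selective loading described above can be sketched as a tiny in-memory store. This is an illustrative sketch only, not vexp's actual implementation or API; every name here is hypothetical.

```ts
// Hypothetical sketch: observations stored against code symbols, with a
// session loading only those whose symbols overlap the code being touched.

type Observation = { content: string; symbols: string[] };

class MemoryStore {
  private observations: Observation[] = [];

  save(content: string, symbols: string[]): void {
    this.observations.push({ content, symbols });
  }

  // Return only observations linked to symbols the current task touches.
  relevantTo(touchedSymbols: string[]): Observation[] {
    const touched = new Set(touchedSymbols);
    return this.observations.filter((o) =>
      o.symbols.some((s) => touched.has(s))
    );
  }
}

const store = new MemoryStore();
store.save("UserService pool has a threading issue", ["UserService"]);
store.save("Inventory updates use optimistic locking", ["InventoryRepo"]);

// A task touching UserService surfaces only the first observation.
console.log(store.relevantTo(["UserService"]).map((o) => o.content));
```

The design choice that matters is the link: because relevance is computed from symbol overlap rather than loading everything, the token cost scales with the task, not with the size of the memory store.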
Staleness Detection: Memory That Knows When It's Wrong
Here's the thing about documentation and notes: they go stale. Code changes; static notes don't update automatically.
vexp handles this through its graph-linked staleness detection:
- Each observation is linked to specific code symbols (functions, classes, methods)
- When those symbols change (tracked via the dependency graph), the linked observations are automatically flagged as potentially stale
- The observation still surfaces but with a staleness warning, so you know to re-verify it
This means memory degrades gracefully instead of silently — you get notified when the code an observation was about has changed, rather than trusting outdated context.
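One way to implement this kind of staleness flagging is to fingerprint each linked symbol's source when the observation is saved, then compare fingerprints on retrieval. A minimal sketch, assuming a content hash as the change signal (illustrative only, not vexp's internals):

```ts
// Hypothetical staleness check: an observation records a fingerprint of the
// symbol's source at save time; a mismatch later means "potentially stale".
import { createHash } from "node:crypto";

const fingerprint = (source: string): string =>
  createHash("sha256").update(source).digest("hex");

type LinkedObservation = {
  content: string;
  symbol: string;
  symbolFingerprint: string;
};

function checkStaleness(
  obs: LinkedObservation,
  currentSource: string
): { content: string; stale: boolean } {
  return {
    content: obs.content,
    stale: fingerprint(currentSource) !== obs.symbolFingerprint,
  };
}

const originalSource = "class UserService { refresh() { /* ... */ } }";
const obs: LinkedObservation = {
  content: "never call .refresh() in background tasks",
  symbol: "UserService",
  symbolFingerprint: fingerprint(originalSource),
};

// After the class changes, the observation surfaces flagged as stale.
const editedSource = "class UserService { refresh(force: boolean) {} }";
console.log(checkStaleness(obs, editedSource)); // stale: true
```

A real system would hash at symbol granularity via the dependency graph rather than whole files, so unrelated edits don't trigger false staleness warnings.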
For the full picture of how session memory works and the staleness detection model, see Session Memory for AI Coding Agents: Why Your Agent Forgets.
Setting Up Session Memory in Claude Code
With vexp as an MCP server, session memory is available via two tools:
Saving observations:
```ts
save_observation({
  content: "UserService connection pool threading issue — never call .refresh() in background tasks",
  linked_symbols: ["UserService", "UserService.refresh"],
  type: "insight"
})
```
Nicola
Developer and creator of vexp — a context engine for AI coding agents. I build tools that make AI coding assistants faster, cheaper, and actually useful on real codebases.
Related Articles

Context Window Management for AI Coding: The Developer's Guide
Learn how AI context windows work, why long coding sessions degrade, and practical strategies and tools like vexp to keep Claude effective and costs low.

Cursor vs Claude Code vs Copilot 2026: The Only Comparison You Need
A practical 2026 comparison of GitHub Copilot, Cursor, and Claude Code based on real production use, with a focus on context, agentic workflows, and pricing.

How to Reduce Claude Code Token Usage by 58% (Without Manual Context Management)
Use a dependency-graph MCP server (vexp) to feed Claude Code only structurally relevant context and cut token costs by ~58%—no prompt changes required.