Claude Code Has No Session Memory — Here's How to Add It

Start a Claude Code session. Write code, make decisions, learn things about the codebase. End the session.
Start a new session the next day. Claude Code has no memory of what happened. Zero.
It will re-read the same files. Re-discover the same patterns. Re-learn the same conventions. You'll pay tokens for all of it.
This is a known limitation — and there's a practical fix.
What "No Session Memory" Actually Means
Claude Code is stateless between sessions. Each conversation is entirely fresh. This is by design: it's how LLM-based tools work. The model doesn't persist knowledge between API calls.
Within a single session, Claude Code does remember — that's just the conversation context. But once you close the session, that context is gone.
This matters more than it might seem. In a typical development day:
- You discover that the UserService class has a subtle threading issue
- You work around it, make a note, continue
- Tomorrow, you open a new session and Claude Code reads UserService fresh. It doesn't know about the threading issue. It might suggest the same problematic pattern again.
- You spend tokens re-explaining something you already explained yesterday.
Multiply this by all the architectural decisions, conventions, and discovered constraints in a production codebase. The cognitive overhead is real.
The CLAUDE.md Partial Solution
The simplest approach is the CLAUDE.md file. Claude Code reads this file at the start of every session, giving you a way to inject persistent context.
Useful things to put in CLAUDE.md:
```markdown
## Architecture Notes
- Auth uses custom JWT (not PyJWT) — see auth/jwt_custom.py
- Inventory updates use optimistic locking due to race condition history
- UserService has a threading issue with the connection pool — never call .refresh() in a background task
## Conventions
- All new endpoints must include rate limiting via @rate_limit decorator
- Error responses use the format in api/errors.py, never return raw exceptions
- Tests live in tests/unit/ for unit tests, tests/integration/ for integration
```
This works, but it has limits:
- It's static — you have to manually update it
- It's global — all context loads every session regardless of relevance
- It doesn't scale — once it gets long, it adds tokens to every session whether needed or not
- It doesn't connect context to specific code symbols, so it can become stale silently
For small teams and simple projects, CLAUDE.md is often enough. For larger teams or complex codebases, you need something more dynamic.
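The "adds tokens to every session" cost is easy to estimate. Here is a minimal sketch (the 4-characters-per-token ratio is a common rough heuristic for English prose, not an exact tokenizer):

```typescript
// Rough heuristic: English prose averages about 4 characters per token.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// A static CLAUDE.md is loaded in full at the start of every session,
// whether or not its contents are relevant to the task at hand.
function staticOverhead(claudeMd: string, sessionsPerMonth: number): number {
  return estimateTokens(claudeMd) * sessionsPerMonth;
}
```

A 4,000-character CLAUDE.md (about 1,000 tokens) read across 40 sessions a month adds roughly 40,000 tokens of fixed overhead, before any actual work happens.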
The Better Approach: Graph-Linked Session Memory
The second approach is a dedicated session memory system. This is what vexp implements.
The key difference from a static CLAUDE.md: observations are linked to code symbols. When vexp's run_pipeline returns context for a task, it queries the memory store and surfaces only the observations that are relevant to the specific code being worked on.
How it works in practice:
- You save an observation during a session: "The UserService connection pool has a threading issue — never call .refresh() in background tasks"
- vexp links this observation to the UserService class symbol in its code graph
- Next session, when you work on anything touching UserService, this observation automatically surfaces in the context capsule
- The observation doesn't appear when you're working on unrelated code — it only loads when relevant
This is fundamentally different from a static CLAUDE.md because the observations are selective: you only pay tokens for the memory that's relevant to the current task.
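vexp's internal storage isn't documented here, but as a mental model (all names hypothetical), symbol-linked retrieval amounts to a store keyed by code symbol, queried with only the symbols the current task touches:

```typescript
interface Observation {
  content: string;
  linkedSymbols: string[];
}

class MemoryStore {
  private bySymbol = new Map<string, Observation[]>();

  // Index the observation under every symbol it is linked to.
  save(obs: Observation): void {
    for (const sym of obs.linkedSymbols) {
      const list = this.bySymbol.get(sym) ?? [];
      list.push(obs);
      this.bySymbol.set(sym, list);
    }
  }

  // Return only observations linked to symbols the current task touches,
  // deduplicated when an observation is linked to several of them.
  relevantTo(symbolsTouched: string[]): Observation[] {
    const seen = new Set<Observation>();
    for (const sym of symbolsTouched) {
      for (const obs of this.bySymbol.get(sym) ?? []) seen.add(obs);
    }
    return [...seen];
  }
}
```

The point of the design: a task touching UserService pulls the threading note into context, while a task on unrelated code pays zero tokens for it.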
Staleness Detection: Memory That Knows When It's Wrong
Here's the thing about documentation and notes: they go stale. Code changes; static notes don't update automatically.
vexp handles this through its graph-linked staleness detection:
- Each observation is linked to specific code symbols (functions, classes, methods)
- When those symbols change (tracked via the dependency graph), the linked observations are automatically flagged as potentially stale
- The observation still surfaces but with a staleness warning, so you know to re-verify it
This means memory degrades gracefully instead of silently — you get notified when the code an observation was about has changed, rather than trusting outdated context.
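One simple way to implement this kind of staleness check (a sketch, not vexp's actual mechanism) is to fingerprint a symbol's source at save time and compare on retrieval:

```typescript
import { createHash } from "node:crypto";

interface TrackedObservation {
  content: string;
  symbol: string;
  symbolHashAtSave: string; // fingerprint of the symbol's source when the note was saved
}

const fingerprint = (source: string): string =>
  createHash("sha256").update(source).digest("hex");

function saveObservation(
  content: string,
  symbol: string,
  symbolSource: string
): TrackedObservation {
  return { content, symbol, symbolHashAtSave: fingerprint(symbolSource) };
}

// An observation is potentially stale if the symbol's source has
// changed since the observation was recorded.
function isStale(obs: TrackedObservation, currentSource: string): boolean {
  return fingerprint(currentSource) !== obs.symbolHashAtSave;
}
```

A stale flag doesn't delete the observation; it just tells you to re-verify before trusting it, which is exactly the graceful degradation described above.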
For the full picture of how session memory works and the staleness detection model, see Session Memory for AI Coding Agents: Why Your Agent Forgets.
Setting Up Session Memory in Claude Code
With vexp as an MCP server, session memory is available via two tools:
Saving observations:
```ts
save_observation({
  content: "UserService connection pool threading issue — never call .refresh() in background tasks",
  linked_symbols: ["UserService", "UserService.refresh"],
  type: "insight"
})
```
Nicola
Developer and creator of vexp — a context engine for AI coding agents. I build tools that make AI coding assistants faster, cheaper, and actually useful on real codebases.