Claude Code Has No Session Memory — Here's How to Add It

Nicola
Start a Claude Code session. Write code, make decisions, learn things about the codebase. End the session.

Start a new session the next day. Claude Code has no memory of what happened. Zero.

It will re-read the same files. Re-discover the same patterns. Re-learn the same conventions. You'll pay tokens for all of it.

This is a known limitation — and there's a practical fix.

What "No Session Memory" Actually Means

Claude Code is stateless between sessions. Each conversation is entirely fresh. This is by design: it's how LLM-based tools work. The model doesn't persist knowledge between API calls.

Within a single session, Claude Code does remember — that's just the conversation context. But once you close the session, that context is gone.

This matters more than it might seem. In a typical development day:

  • You discover that the UserService class has a subtle threading issue
  • You work around it, make a note, continue
  • Tomorrow, you open a new session and Claude Code reads UserService fresh. It doesn't know about the threading issue. It might suggest the same problematic pattern again.
  • You spend tokens re-explaining something you already explained yesterday.

Multiply this by all the architectural decisions, conventions, and discovered constraints in a production codebase. The cognitive overhead is real.

The CLAUDE.md Partial Solution

The simplest approach is the CLAUDE.md file. Claude Code reads this file at the start of every session, giving you a way to inject persistent context.

Useful things to put in CLAUDE.md:

```markdown
## Architecture Notes

- Auth uses custom JWT (not PyJWT) — see auth/jwt_custom.py
- Inventory updates use optimistic locking due to race condition history
- UserService has a threading issue with the connection pool — never call .refresh() in a background task

## Conventions

- All new endpoints must include rate limiting via @rate_limit decorator
- Error responses use the format in api/errors.py, never return raw exceptions
- Tests live in tests/unit/ for unit tests, tests/integration/ for integration
```

This works, but it has limits:

  • It's static — you have to manually update it
  • It's global — all context loads every session regardless of relevance
  • It doesn't scale — once it gets long, it adds tokens to every session whether needed or not
  • It doesn't connect context to specific code symbols, so it can become stale silently

For small teams and simple projects, CLAUDE.md is often enough. For larger teams or complex codebases, you need something more dynamic.

The Better Approach: Graph-Linked Session Memory

The second approach is a dedicated session memory system. This is what vexp implements.

The key difference from a static CLAUDE.md: observations are linked to code symbols. When vexp's run_pipeline returns context for a task, it queries the memory store and surfaces only the observations that are relevant to the specific code being worked on.

How it works in practice:

  1. You save an observation during a session: "The UserService connection pool has a threading issue — never call .refresh() in background tasks"
  2. vexp links this observation to the UserService class symbol in its code graph
  3. Next session, when you work on anything touching UserService, this observation automatically surfaces in the context capsule
  4. The observation doesn't appear when you're working on unrelated code — it only loads when relevant

This is fundamentally different from a static CLAUDE.md because the observations are selective: you only pay tokens for the memory that's relevant to the current task.
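
The idea can be sketched in a few lines of plain TypeScript. This is not vexp's actual API; the store, method names, and relevance rule here are illustrative assumptions:

```ts
// Hypothetical in-memory store showing symbol-linked observations.
// Names and shapes are assumptions for illustration, not vexp's API.
type Observation = {
  content: string;
  linkedSymbols: string[]; // code symbols this note is about
};

class ObservationStore {
  private observations: Observation[] = [];

  save(obs: Observation): void {
    this.observations.push(obs);
  }

  // Surface only observations linked to symbols touched by the current task.
  relevantTo(touchedSymbols: string[]): Observation[] {
    const touched = new Set(touchedSymbols);
    return this.observations.filter((o) =>
      o.linkedSymbols.some((s) => touched.has(s))
    );
  }
}

const store = new ObservationStore();
store.save({
  content:
    "UserService connection pool has a threading issue: never call .refresh() in a background task",
  linkedSymbols: ["UserService", "UserService.refresh"],
});

// Working on UserService: the note surfaces.
console.log(store.relevantTo(["UserService"]).length); // 1
// Working on unrelated code: nothing loads, no tokens spent.
console.log(store.relevantTo(["InvoiceService"]).length); // 0
```

The filtering step is the point: relevance is decided per task, so the cost of a large memory store stays proportional to what the current work actually touches.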

Staleness Detection: Memory That Knows When It's Wrong

Here's the thing about documentation and notes: they go stale. Code changes; static notes don't update automatically.

vexp handles this through its graph-linked staleness detection:

  • Each observation is linked to specific code symbols (functions, classes, methods)
  • When those symbols change (tracked via the dependency graph), the linked observations are automatically flagged as potentially stale
  • The observation still surfaces but with a staleness warning, so you know to re-verify it

This means memory degrades gracefully instead of silently — you get notified when the code an observation was about has changed, rather than trusting outdated context.
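
One simple way to implement this kind of check (again a hypothetical sketch, not vexp's implementation) is to store a fingerprint of each linked symbol's source at save time and flag the observation whenever the current fingerprint differs:

```ts
import { createHash } from "node:crypto";

// Illustrative staleness check, not vexp's implementation: an observation
// remembers a hash of each linked symbol's source text at save time, and
// is flagged stale when the current source no longer matches.
const fingerprint = (source: string): string =>
  createHash("sha256").update(source).digest("hex");

type LinkedObservation = {
  content: string;
  symbolHashes: Map<string, string>; // symbol name -> source hash at save time
};

function isStale(
  obs: LinkedObservation,
  currentSources: Map<string, string> // symbol name -> current source text
): boolean {
  for (const [symbol, savedHash] of obs.symbolHashes) {
    const source = currentSources.get(symbol);
    // Symbol deleted, or its source changed since the note was saved.
    if (source === undefined || fingerprint(source) !== savedHash) return true;
  }
  return false;
}

const refreshV1 = "def refresh(self): ...";
const obs: LinkedObservation = {
  content: "never call .refresh() in a background task",
  symbolHashes: new Map([["UserService.refresh", fingerprint(refreshV1)]]),
};

// Unchanged code: the observation is still trustworthy.
console.log(isStale(obs, new Map([["UserService.refresh", refreshV1]]))); // false
// The method changed: surface the note with a staleness warning.
const refreshV2 = "def refresh(self, force=True): ...";
console.log(isStale(obs, new Map([["UserService.refresh", refreshV2]]))); // true
```

A real system would fingerprint parsed symbols from the dependency graph rather than raw text, but the degradation model is the same: the memory is never silently wrong, only explicitly unverified.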

For the full picture of how session memory works and the staleness detection model, see Session Memory for AI Coding Agents: Why Your Agent Forgets.

Setting Up Session Memory in Claude Code

With vexp as an MCP server, session memory is available via two tools:

Saving observations:

```ts
save_observation({
  content: "UserService connection pool threading issue — never call .refresh() in background tasks",
  linked_symbols: ["UserService", "UserService.refresh"],
  type: "insight"
})
```

Frequently Asked Questions

Does Claude Code remember anything between sessions?
No. Claude Code has no built-in cross-session memory. Each new session starts completely fresh with zero knowledge of previous sessions. The only persistent mechanism natively supported is CLAUDE.md, which lets you pre-load static text instructions at session start, but this is manually maintained and not linked to code changes.
How can I add session memory to Claude Code?
The most powerful way is via vexp's MCP server, which provides graph-linked session memory that persists across sessions, is searchable semantically, and automatically detects stale observations. A simpler DIY approach is maintaining a memory/ directory with markdown files and instructing Claude to read and update them during sessions, though this lacks staleness detection and semantic search.
What is the difference between CLAUDE.md and proper session memory?
CLAUDE.md is a static, manually-maintained file that loads every session. It's good for stable project conventions but doesn't capture dynamic observations, can't detect when its own content becomes outdated, and doesn't support search across multiple sessions. Proper session memory (like vexp's) is automatically captured, linked to code symbols, staleness-aware, and semantically searchable across unlimited historical sessions.
How does vexp's session memory differ from manual memory files?
vexp's session memory is automatically populated from every tool call the agent makes — no manual effort required. Observations are linked to specific code graph nodes, so when that code changes, the linked memories are flagged as stale. The search is semantic rather than keyword-based, so 'auth module' retrieves memories about authentication even if they don't contain the exact phrase. Manual files support none of this.
Can session memory improve code quality, not just reduce tokens?
Yes, significantly. With session memory, the agent retains architectural decisions (why a particular pattern was chosen), past bug investigations (what was tried and why it didn't work), and project-specific conventions (how errors are handled, what testing patterns are expected). This leads to more consistent, idiomatic code and fewer cases where the agent suggests something that contradicts established project patterns.

Nicola

Developer and creator of vexp — a context engine for AI coding agents. I build tools that make AI coding assistants faster, cheaper, and actually useful on real codebases.
