Building a Memory Bank for Claude Code (That Survives Session Resets)

Nicola

Claude Code starts every new session with a blank slate: no memory of last week's architecture decisions, no recall of that tricky auth edge case, no awareness of your team's preferred patterns.

This session reset problem typically costs 5–10 minutes of context-priming at the start of every session. Multiplied across a team running 10+ sessions a day, that's 50–100 minutes of lost time daily.

You can fix this by building a persistent memory bank. This guide covers three layers of memory for Claude Code and how to combine them into a system that actually works.

Why Sessions Reset: The Architecture

Claude Code doesn't run as a persistent background process. Each session is a fresh API conversation with its own context window. When the session ends, everything in that window disappears: files you opened, decisions you made, patterns you established.

This stateless architecture makes behavior predictable and debuggable — but it means any notion of memory has to be explicitly stored somewhere you can reload it later.

There are three practical layers for doing this:

  • Static context: things that rarely change (CLAUDE.md)
  • Dynamic context: things that change per-task or per-sprint (memory files)
  • Automatic context: things discovered by indexing your codebase (vexp session memory)

Each layer serves a different purpose. The best setups use all three.

Layer 1: CLAUDE.md — Your Static Memory Bank

CLAUDE.md is a file that Claude Code automatically loads at the start of every session. Think of it as a briefing document Claude reads before doing anything else.

Anything you put here is available immediately, without re-explaining it every session.

What to Put in CLAUDE.md

Keep it focused on high-value, stable context. A well-structured CLAUDE.md has four sections:

Project overview (2–4 sentences)

Explain what the project is, the stack it uses, and where the main code lives. This is the single most valuable section because Claude can't infer project purpose from code alone.

Architecture decisions

Document the why, not just the what. Example entries:

  • We use CQRS pattern: read models are in src/queries/, write models in src/commands/.
  • Repository pattern for all database access — never query the DB directly from API handlers.
  • All API endpoints must validate with Pydantic v2 models in src/schemas/.
  • Async everywhere: all IO operations use async/await.

Code conventions

Document formatting tools (Black, Prettier), test frameworks (pytest, Vitest), commit message style, and other conventions that don't change often but affect every suggestion Claude makes.

Things not to re-derive

Call out important facts Claude shouldn't waste time rediscovering:

  • The auth middleware is in src/middleware/auth.py — don't re-implement it.
  • All dates are UTC.
  • The test database is separate (TEST_DATABASE_URL env var).

What Not to Put in CLAUDE.md

  • File contents or code snippets — Claude can read files directly; pasting them wastes tokens.
  • Complete API documentation — too long; link to a docs file instead.
  • Information that changes frequently — put that in dynamic memory instead.

Target length: 300–600 words. Short enough to load fast, complete enough to be useful.

CLAUDE.md Placement

Claude Code looks for CLAUDE.md files in three places:

  • ~/.claude/CLAUDE.md — global, applies to all projects.
  • .claude/CLAUDE.md — project-level, applies to this project.
  • Subdirectory CLAUDE.md files — applies when working in that subdirectory.

This hierarchy lets you have global preferences (your communication style, formatting preferences) and project-specific context (architecture, conventions) separately.

Layer 2: Dynamic Memory Files

Some context changes too frequently for CLAUDE.md (which you want to keep stable), but still needs to persist across sessions.

The Memory Directory Pattern

Create a .claude/memory/ directory in your project with these files:

  • MEMORY.md — main index with recent notable context.
  • decisions.md — running log of architecture decisions.
  • patterns.md — discovered patterns and solutions.
  • debugging.md — recurring bug patterns and fixes.
  • sprint.md — current sprint context (updated weekly).
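The layout above can be scaffolded in one shot. This is a sketch: the generated header text is just a placeholder, and existing files are left untouched.

```shell
# Scaffold the memory directory with starter files (names from the list above)
mkdir -p .claude/memory
for f in MEMORY.md decisions.md patterns.md debugging.md sprint.md; do
  # Create each file with a title header, without clobbering existing notes
  [ -f ".claude/memory/$f" ] || printf '# %s\n\n' "${f%.md}" > ".claude/memory/$f"
done
```

Re-running it is safe, so it can live in a project bootstrap script.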

In your CLAUDE.md, reference these files:

For current sprint context and recent decisions, read .claude/memory/MEMORY.md and .claude/memory/decisions.md at session start.

The Decisions Log

The most valuable memory file is decisions.md, which captures the why behind architecture changes. A good entry looks like:

2026-02-15: Switched from SQLAlchemy to asyncpg — SQLAlchemy async support was causing connection pool issues under load. asyncpg is significantly faster for our read-heavy workload. All queries now in src/db/queries/.

2026-02-28: Added Redis caching for user sessions — Session lookups were hitting DB on every request. Redis TTL set to 24h (same as JWT expiry). Cache invalidation in src/services/auth.py:invalidate_user_session().

When Claude reads this before your next session, it understands your architecture's current state — not just the code, but the reasoning behind it.

The Patterns File

patterns.md captures solutions to recurring problems:

Async pagination pattern

Always use cursor-based pagination (not offset) for consistent performance:

```sql
-- Cursor-based pagination: resume from the last id seen on the previous page.
-- Table and column names here are illustrative.
SELECT id, created_at
FROM items
WHERE id > $1      -- $1 = last id from the previous page
ORDER BY id
LIMIT $2           -- $2 = page size
```

Error handling in API handlers

Always wrap service calls in try/except and map domain exceptions to HTTP status codes. See src/api/utils/errors.py for the mapping.

Updating Memory Files

The key habit: after every session where you discover something useful, update the relevant memory file before closing. This takes 2–3 minutes and saves 5–10 minutes at the next session start.

You can even ask Claude:

Before we end, write a summary of the architectural decisions we made today to .claude/memory/decisions.md.
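If you prefer to log decisions yourself, a small shell helper keeps entries dated and consistently formatted. The function name memlog is made up for this sketch; the entry format matches the decisions.md examples above.

```shell
# Hypothetical helper: append a dated entry to the decisions log
mkdir -p .claude/memory
memlog() {
  printf '%s: %s\n' "$(date +%F)" "$*" >> .claude/memory/decisions.md
}

# Example usage
memlog "Switched session cache to Redis (TTL 24h, matching JWT expiry)"
```

Dropping this into your shell profile makes updating the log a one-liner at session end.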

Layer 3: Automated Session Memory with vexp

The first two layers require manual maintenance. They work, but they're only as good as your discipline in updating them.

vexp adds a third layer: automatic session memory tied to your codebase graph.

How vexp Session Memory Works

When you use vexp tools during a session, it automatically captures observations — what code you explored, what decisions were made, what patterns were found. These observations are:

  • Linked to code symbols (function names, class names) — so if the linked code changes, the observation is flagged as potentially stale.
  • Searchable across sessions — search_memory("authentication bug") finds relevant past observations.
  • Surfaced automatically — when you run run_pipeline on a task, related memories from previous sessions are included in the results.

This means context from past sessions flows into new sessions automatically, without manual effort.

Manual Observations

You can also save observations explicitly:

```bash
save_observation("Switched to asyncpg — all queries now in src/db/queries/")
```

Or using the observation parameter in run_pipeline:

```bash
run_pipeline({ task: "fix session caching bug", observation: "Redis TTL mismatch — JWT is 24h but Redis was set to 12h" })
```

The Staleness Signal

When code linked to a memory observation changes, vexp flags the observation as potentially stale. If you saved "auth validation is in src/middleware/auth.py" and then that file was refactored, the observation gets flagged. You'll see it in results but with a staleness indicator — a useful signal that your memory bank needs updating.

Putting It Together: The Complete Setup

Here is the full setup, in priority order.

Step 1 — Create global CLAUDE.md

Create ~/.claude/CLAUDE.md with your preferred communication style, global formatting preferences, and tools you use across all projects. This takes ~15 minutes once and benefits every project immediately.

Step 2 — Create project CLAUDE.md

Create .claude/CLAUDE.md with project overview, tech stack, architecture decisions, and code conventions. Aim for 300–600 words. This is the highest-ROI investment for any specific project.
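A skeleton with the four sections from Layer 1 can be generated from the shell. The project details below are placeholders to replace with your own:

```shell
# Write a minimal project CLAUDE.md skeleton (contents are placeholders)
mkdir -p .claude
cat > .claude/CLAUDE.md <<'EOF'
# Project overview
Example: a FastAPI + PostgreSQL service; main code lives in src/.

# Architecture decisions
- Repository pattern for all DB access; never query the DB from API handlers.

# Code conventions
- Black for formatting, pytest for tests.

# Things not to re-derive
- All dates are UTC.
EOF
```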

Step 3 — Create memory directory

Create .claude/memory/ and start with MEMORY.md and decisions.md. Add patterns.md and sprint.md as the project grows. Commit these files to the repo so the whole team benefits.

Step 4 — Install vexp for automated session memory

Install and index your workspace:

```bash
npm install -g vexp-cli
vexp-core index
```

Then add vexp to your Claude Code MCP config. The exact server entry comes from the vexp docs; the snippet below is a sketch of the standard MCP config shape with a placeholder command name:

```json
{
  "mcpServers": {
    "vexp": {
      "command": "vexp-mcp"
    }
  }
}
```
Frequently Asked Questions

Does the memory bank survive Claude Code updates or reinstalls?

Yes. Everything in the three-layer stack lives in plain text files — your global CLAUDE.md, project CLAUDE.md, and memory directory. These survive Claude Code updates, reinstalls, and machine changes. The vexp index is local-only (index.db is gitignored), but observations saved with save_observation persist in vexp's own database independently.

How much overhead does MEMORY.md add to each session?

A 200-line MEMORY.md adds roughly 1,500–2,000 tokens per session start. That sounds costly, but compare it to spending 3,000–5,000 tokens re-explaining architecture the AI already worked through in a previous session. The memory bank pays for itself in session two.

Can I use this approach with Cursor, Windsurf, or other AI coding agents?

The CLAUDE.md pattern is Claude Code-specific for auto-loading, but the concept works in any agent that supports project configuration files. The vexp session memory layer works across all 12 supported agents — Claude Code, Cursor, Windsurf, GitHub Copilot, Continue.dev, Augment, Zed, Codex, Opencode, Kilo Code, Kiro, and Antigravity.

What if MEMORY.md grows too large?

Claude Code loads MEMORY.md at session start but truncates content after 200 lines. Keep it to a concise index of key facts and link to separate topic files (e.g., memory/authentication.md, memory/api-patterns.md) for detail. The index-plus-detail pattern keeps the primary file within limits while preserving all context.
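A quick guard keeps the index honest. This sketch checks the line count against the 200-line budget described above; the function name and default path are illustrative.

```shell
# Warn when the memory index outgrows the ~200-line load budget
check_memory_size() {
  file="${1:-.claude/memory/MEMORY.md}"
  limit=200
  # tr strips the leading spaces some wc implementations emit
  lines=$(wc -l < "$file" | tr -d ' ')
  if [ "$lines" -gt "$limit" ]; then
    echo "WARN: $file is $lines lines (over $limit); move detail into topic files"
  else
    echo "OK: $file is $lines lines"
  fi
}
```

Run it from a pre-commit hook or CI so the index never silently exceeds the budget.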

Does vexp session memory work offline?

Yes. The vexp daemon (vexp-core) runs locally and observations are stored in a local database. The only network calls are initial activation and optional telemetry (which can be disabled). Once installed and indexed, session memory works entirely offline.


Does Claude Code remember anything between sessions?

No. Claude Code starts every new session with a blank slate — no memory of previous architecture decisions, edge cases, or team patterns. This session reset typically costs 5–10 minutes of context-priming at the start of every session, which adds up significantly across a team.

What is a memory bank for Claude Code?

A memory bank is a persistent store of observations, decisions, and code insights that survives between Claude Code sessions. It typically involves three layers: static project context (like CLAUDE.md), manual memory files maintained by the developer, and automated session memory that captures and retrieves relevant context automatically.

How does CLAUDE.md work as a memory layer?

CLAUDE.md is a static file loaded automatically at the start of every Claude Code session. It provides project-level context like architecture decisions, coding conventions, and key file paths. However, it's manually maintained, doesn't adapt to what you're currently working on, and can become stale if not regularly updated.

What is automated session memory for AI coding agents?

Automated session memory captures observations, decisions, and insights during coding sessions and makes them available in future sessions. Unlike manual memory files, it auto-captures context, links observations to code symbols for staleness tracking, and surfaces relevant memories based on what you're currently working on.

How much time does session memory save per day?

Without memory, developers typically spend 5–10 minutes re-establishing context at the start of each session. For a team running 10+ sessions per day, that's 50–100 minutes of wasted time daily. Persistent memory eliminates most of this re-priming overhead, letting sessions start productive immediately.

Nicola

Developer and creator of vexp — a context engine for AI coding agents. I build tools that make AI coding assistants faster, cheaper, and actually useful on real codebases.
