Building a Memory Bank for Claude Code (That Survives Session Resets)

Claude Code starts every new session with a blank slate: no memory of last week's architecture decisions, no recall of that tricky auth edge case, no awareness of your team's preferred patterns.
This session reset problem typically costs 5–10 minutes of context-priming at the start of every session. Multiplied across a team running 10+ sessions a day, that's significant wasted time.
You can fix this by building a persistent memory bank. This guide covers three layers of memory for Claude Code and how to combine them into a system that actually works.
Why Sessions Reset: The Architecture
Claude Code doesn't run as a persistent background process. Each session is a fresh API conversation with its own context window. When the session ends, everything in that window disappears: files you opened, decisions you made, patterns you established.
This stateless architecture makes behavior predictable and debuggable — but it means any notion of memory has to be explicitly stored somewhere you can reload it later.
There are three practical layers for doing this:
- Static context: things that rarely change (CLAUDE.md)
- Dynamic context: things that change per-task or per-sprint (memory files)
- Automatic context: things discovered by indexing your codebase (vexp session memory)
Each layer serves a different purpose. The best setups use all three.
Layer 1: CLAUDE.md — Your Static Memory Bank
CLAUDE.md is a file that Claude Code automatically loads at the start of every session. Think of it as a briefing document Claude reads before doing anything else.
Anything you put here is available immediately, without re-explaining it every session.
What to Put in CLAUDE.md
Keep it focused on high-value, stable context. A well-structured CLAUDE.md has four sections:
Project overview (2–4 sentences)
Explain what the project is, the stack it uses, and where the main code lives. This is the single most valuable section because Claude can't infer project purpose from code alone.
Architecture decisions
Document the why, not just the what. Example entries:
- We use CQRS pattern: read models are in src/queries/, write models in src/commands/.
- Repository pattern for all database access: never query the DB directly from API handlers.
- All API endpoints must validate with Pydantic v2 models in src/schemas/.
- Async everywhere: all IO operations use async/await.
Code conventions
Document formatting tools (Black, Prettier), test frameworks (pytest, Vitest), commit message style, and other conventions that don't change often but affect every suggestion Claude makes.
Things not to re-derive
Call out important facts Claude shouldn't waste time rediscovering:
- The auth middleware is in src/middleware/auth.py; don't re-implement it.
- All dates are UTC.
- The test database is separate (TEST_DATABASE_URL env var).
What Not to Put in CLAUDE.md
- File contents or code snippets — Claude can read files directly; pasting them wastes tokens.
- Complete API documentation — too long; link to a docs file instead.
- Information that changes frequently — put that in dynamic memory instead.
Target length: 300–600 words. Short enough to load fast, complete enough to be useful.
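The 300–600 word target is easy to check mechanically. The sketch below is one way to do it; the function name and the whitespace-based word count are assumptions made for illustration, not part of Claude Code.

```python
import re


def check_length(text: str, low: int = 300, high: int = 600) -> str:
    """Classify a CLAUDE.md body against a word-count target."""
    # Count whitespace-separated tokens; markdown symbols count as words here.
    words = len(re.findall(r"\S+", text))
    if words < low:
        return f"{words} words: below target ({low}-{high})"
    if words > high:
        return f"{words} words: above target ({low}-{high})"
    return f"{words} words: within target ({low}-{high})"
```

Passing the contents of your CLAUDE.md to check_length gives a quick signal for when the file needs trimming or fleshing out.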
CLAUDE.md Placement
Claude Code looks for CLAUDE.md files in three places:
- ~/.claude/CLAUDE.md: global, applies to all projects.
- .claude/CLAUDE.md: project-level, applies to this project.
- Subdirectory CLAUDE.md files: apply when working in that subdirectory.
This hierarchy lets you have global preferences (your communication style, formatting preferences) and project-specific context (architecture, conventions) separately.
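To see which files could apply to a given working directory, you can enumerate the locations listed above. This is only an illustrative sketch: the function name is hypothetical, and it mirrors the list in this guide rather than Claude Code's actual lookup logic.

```python
from pathlib import Path


def candidate_claude_files(home: Path, project_root: Path, working_dir: Path) -> list[Path]:
    """List possible CLAUDE.md locations, from most general to most specific."""
    candidates = [
        home / ".claude" / "CLAUDE.md",          # global
        project_root / ".claude" / "CLAUDE.md",  # project-level
    ]
    # Walk from the project root down to the working directory, one level at a time.
    current = project_root
    for part in working_dir.relative_to(project_root).parts:
        current = current / part
        candidates.append(current / "CLAUDE.md")  # subdirectory-level
    return candidates
```

Checking which of these paths exist in your repo is a quick way to audit where context is coming from.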
Layer 2: Dynamic Memory Files
Some context changes too frequently for CLAUDE.md (which you want to keep stable), but still needs to persist across sessions.
The Memory Directory Pattern
Create a .claude/memory/ directory in your project with these files:
- MEMORY.md: main index with recent notable context.
- decisions.md: running log of architecture decisions.
- patterns.md: discovered patterns and solutions.
- debugging.md: recurring bug patterns and fixes.
- sprint.md: current sprint context (updated weekly).
In your CLAUDE.md, reference these files:
For current sprint context and recent decisions, read .claude/memory/MEMORY.md and .claude/memory/decisions.md at session start.
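If you set this up in more than one repo, a small script can scaffold the directory. The sketch below creates the five files named above with a one-line placeholder each; the helper name and the placeholder text are assumptions for illustration.

```python
from pathlib import Path

# File names come from the list above; the descriptions are placeholder content.
MEMORY_FILES = {
    "MEMORY.md": "Main index with recent notable context.",
    "decisions.md": "Running log of architecture decisions.",
    "patterns.md": "Discovered patterns and solutions.",
    "debugging.md": "Recurring bug patterns and fixes.",
    "sprint.md": "Current sprint context (updated weekly).",
}


def scaffold_memory_dir(project_root: Path) -> list[Path]:
    """Create .claude/memory/ and any missing memory files; return the files created."""
    memory_dir = project_root / ".claude" / "memory"
    memory_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for name, description in MEMORY_FILES.items():
        path = memory_dir / name
        if not path.exists():  # never overwrite notes that already exist
            path.write_text(f"# {name}\n\n{description}\n", encoding="utf-8")
            created.append(path)
    return created
```

Because existing files are skipped, the script is safe to re-run after you have started taking notes.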
The Decisions Log
The most valuable memory file is decisions.md, which captures the why behind architecture changes. A good entry looks like:
2026-02-15: Switched from SQLAlchemy to asyncpg — SQLAlchemy async support was causing connection pool issues under load. asyncpg is significantly faster for our read-heavy workload. All queries now in src/db/queries/.
2026-02-28: Added Redis caching for user sessions — Session lookups were hitting DB on every request. Redis TTL set to 24h (same as JWT expiry). Cache invalidation in src/services/auth.py:invalidate_user_session().
When Claude reads this before your next session, it understands your architecture's current state — not just the code, but the reasoning behind it.
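Entries like these are easier to keep consistent when a helper writes them. The following is a minimal sketch, assuming a "date: title - rationale" line format similar to the examples above; the function name and the separator are illustrative choices.

```python
from datetime import date
from pathlib import Path


def append_decision(log_path: Path, day: date, title: str, rationale: str) -> str:
    """Append one dated entry to the decisions log and return the line written."""
    entry = f"{day.isoformat()}: {title} - {rationale}\n"
    log_path.parent.mkdir(parents=True, exist_ok=True)
    # Append mode keeps earlier entries intact, so the file stays a running log.
    with log_path.open("a", encoding="utf-8") as log:
        log.write(entry)
    return entry
```

For example, append_decision(Path(".claude/memory/decisions.md"), date.today(), "Added Redis caching for user sessions", "session lookups were hitting the DB on every request") adds a new line without touching older entries.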
The Patterns File
patterns.md captures solutions to recurring problems:
Async pagination pattern
Always use cursor-based pagination (not offset) for consistent performance:
```sql
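-- $1 is the last id seen on the previous page; $2 is the page size.
SELECT *
FROM items  -- hypothetical table name
WHERE id > $1
ORDER BY id
LIMIT $2;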
```
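To show how the cursor moves from page to page, here is a self-contained sketch using Python's built-in sqlite3 module. It is synchronous and uses ? placeholders instead of the $1/$2 style above; the items table and both function names are hypothetical.

```python
import sqlite3


def fetch_page(conn: sqlite3.Connection, last_seen_id: int, page_size: int):
    """Return (rows, next_cursor); next_cursor is None when no rows are left."""
    rows = conn.execute(
        "SELECT id, name FROM items WHERE id > ? ORDER BY id LIMIT ?",
        (last_seen_id, page_size),
    ).fetchall()
    # The id of the last row becomes the cursor for the next request.
    next_cursor = rows[-1][0] if rows else None
    return rows, next_cursor


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory table with seven rows for experimenting."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO items (id, name) VALUES (?, ?)",
        [(i, f"item-{i}") for i in range(1, 8)],
    )
    return conn
```

Start with a cursor of 0 and pass each returned next_cursor into the following call. Unlike offset pagination, the database never has to skip over rows it has already returned.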
Error handling in API handlers
Always wrap service calls in try/except and map domain exceptions to HTTP status codes. See src/api/utils/errors.py for the mapping.
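A mapping of this kind might look like the sketch below. The contents of src/api/utils/errors.py are not shown in this guide, so the exception classes, status codes, and the call_service helper are all assumptions chosen for illustration.

```python
class DomainError(Exception):
    """Base class for errors raised by the service layer (hypothetical)."""


class NotFoundError(DomainError):
    pass


class PermissionDeniedError(DomainError):
    pass


class ValidationFailedError(DomainError):
    pass


# Hypothetical mapping from domain exceptions to HTTP status codes.
STATUS_BY_EXCEPTION = {
    NotFoundError: 404,
    PermissionDeniedError: 403,
    ValidationFailedError: 422,
}


def call_service(service_call, *args):
    """Run a service call and return (status_code, payload)."""
    try:
        return 200, service_call(*args)
    except DomainError as exc:
        # Unmapped domain errors fall back to 500 rather than leaking details.
        status = STATUS_BY_EXCEPTION.get(type(exc), 500)
        return status, {"error": str(exc)}
```

Keeping the mapping in one dictionary means handlers never pick status codes themselves, which is the behavior the pattern entry asks Claude to preserve.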
Updating Memory Files
The key habit: after every session where you discover something useful, update the relevant memory file before closing. This takes 2–3 minutes and saves 5–10 minutes at the next session start.
You can even ask Claude:
Before we end, write a summary of the architectural decisions we made today to .claude/memory/decisions.md.
Layer 3: Automated Session Memory with vexp
The first two layers require manual maintenance. They work, but they're only as good as your discipline in updating them.
vexp adds a third layer: automatic session memory tied to your codebase graph.
How vexp Session Memory Works
When you use vexp tools during a session, it automatically captures observations — what code you explored, what decisions were made, what patterns were found. These observations are:
- Linked to code symbols (function names, class names) — so if the linked code changes, the observation is flagged as potentially stale.
- Searchable across sessions: search_memory("authentication bug") finds relevant past observations.
- Surfaced automatically: when you run run_pipeline on a task, related memories from previous sessions are included in the results.
This means context from past sessions flows into new sessions automatically, without manual effort.
Manual Observations
You can also save observations explicitly:
```bash
save_observation("Switched to asyncpg — all queries now in src/db/queries/")
```
Or using the observation parameter in run_pipeline:
```bash
run_pipeline({ task: "fix session caching bug", observation: "Redis TTL mismatch — JWT is 24h but Redis was set to 12h" })
```
The Staleness Signal
When code linked to a memory observation changes, vexp flags the observation as potentially stale. If you saved "auth validation is in src/middleware/auth.py" and then that file was refactored, the observation gets flagged. You'll see it in results but with a staleness indicator — a useful signal that your memory bank needs updating.
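The general idea of a staleness check can be sketched with content fingerprints. This is not vexp's implementation, which links observations to code symbols; the version below works at file level, and every name in it is hypothetical.

```python
import hashlib
from dataclasses import dataclass


def fingerprint(source: str) -> str:
    """Hash file contents so later changes can be detected."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


@dataclass
class Observation:
    note: str
    linked_path: str
    fingerprint_at_save: str  # hash of the linked file when the note was saved


def is_stale(observation: Observation, current_sources: dict[str, str]) -> bool:
    """Flag an observation when its linked file changed or disappeared."""
    current = current_sources.get(observation.linked_path)
    if current is None:
        return True  # the linked file no longer exists
    return fingerprint(current) != observation.fingerprint_at_save
```

A flagged observation is not necessarily wrong. It only means the code it describes has changed since the note was written, so it deserves a second look.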
Putting It Together: The Complete Setup
Here is the full setup, in priority order.
Step 1 — Create global CLAUDE.md
Create ~/.claude/CLAUDE.md with your preferred communication style, global formatting preferences, and tools you use across all projects. This takes ~15 minutes once and benefits every project immediately.
Step 2 — Create project CLAUDE.md
Create .claude/CLAUDE.md with project overview, tech stack, architecture decisions, and code conventions. Aim for 300–600 words. This is the highest-ROI investment for any specific project.
Step 3 — Create memory directory
Create .claude/memory/ and start with MEMORY.md and decisions.md. Add patterns.md and sprint.md as the project grows. Commit these files to the repo so the whole team benefits.
Step 4 — Install vexp for automated session memory
Install and index your workspace:
```bash
npm install -g vexp-cli
vexp-core index
```
Then add vexp to your Claude Code MCP config.
Frequently Asked Questions
Does the memory bank survive Claude Code updates or reinstalls?
Yes. Everything in the three-layer stack lives in plain text files — your global CLAUDE.md, project CLAUDE.md, and memory directory. These survive Claude Code updates, reinstalls, and machine changes. The vexp index is local-only (index.db is gitignored), but observations saved with save_observation persist in vexp's own database independently.
How much overhead does MEMORY.md add to each session?
A 200-line MEMORY.md adds roughly 1,500–2,000 tokens per session start. That sounds costly, but compare it to spending 3,000–5,000 tokens re-explaining architecture the AI already worked through in a previous session. The memory bank pays for itself in session two.
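The trade-off can be made concrete with the ranges quoted above. This is a back-of-the-envelope sketch using those estimates as inputs, not measured values; the function name is illustrative.

```python
def net_tokens_saved(memory_tokens: int, reexplain_tokens: int) -> int:
    """Tokens saved in one session by loading memory instead of re-explaining."""
    return reexplain_tokens - memory_tokens


# Least favorable pairing: largest index, smallest re-explanation.
worst_case = net_tokens_saved(memory_tokens=2000, reexplain_tokens=3000)
# Most favorable pairing: smallest index, largest re-explanation.
best_case = net_tokens_saved(memory_tokens=1500, reexplain_tokens=5000)
print(worst_case, best_case)  # 1000 3500
```

Even in the least favorable pairing the index comes out ahead, provided the session would otherwise have needed that re-explanation.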
Can I use this approach with Cursor, Windsurf, or other AI coding agents?
The CLAUDE.md pattern is Claude Code-specific for auto-loading, but the concept works in any agent that supports project configuration files. The vexp session memory layer works across all 12 supported agents — Claude Code, Cursor, Windsurf, GitHub Copilot, Continue.dev, Augment, Zed, Codex, Opencode, Kilo Code, Kiro, and Antigravity.
What if MEMORY.md grows too large?
Claude Code loads MEMORY.md at session start but truncates content after 200 lines. Keep it to a concise index of key facts and link to separate topic files (e.g., memory/authentication.md, memory/api-patterns.md) for detail. The index-plus-detail pattern keeps the primary file within limits while preserving all context.
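A quick check can warn you before the index outgrows that limit. The sketch below assumes the 200-line figure stated above and uses a hypothetical helper name.

```python
def lines_over_limit(text: str, limit: int = 200) -> int:
    """Return how many lines fall past the limit (0 if the file fits)."""
    return max(0, len(text.splitlines()) - limit)
```

Running this over MEMORY.md in a pre-commit hook or CI step is one way to get a reminder to move detail into topic files.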
Does vexp session memory work offline?
Yes. The vexp daemon (vexp-core) runs locally and observations are stored in a local database. The only network calls are initial activation and optional telemetry (which can be disabled). Once installed and indexed, session memory works entirely offline.
Nicola
Developer and creator of vexp — a context engine for AI coding agents. I build tools that make AI coding assistants faster, cheaper, and actually useful on real codebases.