Context Rot: Why Claude Code Gets Worse the Longer You Chat

You start a Claude Code session with a clear question, get a sharp answer, and make progress. Twenty minutes later, the answers feel muddier. Thirty minutes in, Claude is contradicting things it told you earlier.
That pattern has a name: context rot. Once you understand it, you can work around it—and design better tools that avoid it entirely.
What Context Rot Actually Is
Context rot is the gradual degradation of output quality as a conversation accumulates irrelevant, contradictory, or outdated information.
LLMs don’t distinguish between helpful context and noise inside a single context window. Every message, file, and error log you paste is treated as potentially relevant. As the conversation grows, three things happen:
- The signal-to-noise ratio drops.
Early on, almost everything in context is directly relevant. Later, the window is full of tangents, abandoned approaches, superseded solutions, and resolved questions.
- Earlier content gets less attention.
Due to how transformer attention works, later tokens tend to get more weight than earlier ones. Your carefully explained architecture from the start of the session can matter less than an off-hand comment from five minutes ago.
- Contradictions compound.
If you tried approach A, then rejected it for approach B, both A and B remain in context. The model has to infer which is current. It sometimes mixes them.
The result: answers drift, become inconsistent, and feel strangely overcomplicated.
The 40-Message Rule of Thumb
From observing developers using AI coding agents, a rough threshold emerges:
Once a session passes ~40 messages, context rot often becomes noticeable.
Beyond that point, you’re more likely to see:
- Hedged, overlong answers
- Contradictions with earlier guidance
- Confusion about which version of the code is current
This isn’t a hard limit. It depends on:
- How long each message is
- How many files and logs you’ve pasted
- How tightly focused the session has been
But if you’re past ~40 messages and the answers feel off, context rot is a prime suspect.
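The rule of thumb above can be sketched as a tiny heuristic. This is not a real Claude Code API; the ~40-message threshold, the 50k-token cutoff, and the 4-characters-per-token estimate are all rough assumptions for illustration.

```python
# Hypothetical rot-risk check: combines message count with an estimate
# of total pasted text. All thresholds are approximations.

def rot_risk(messages: list[str], msg_threshold: int = 40,
             token_threshold: int = 50_000) -> str:
    # ~4 characters per token is a common rough estimate for English text
    est_tokens = sum(len(m) for m in messages) // 4
    if len(messages) > msg_threshold or est_tokens > token_threshold:
        return "high"
    if len(messages) > msg_threshold // 2:
        return "medium"
    return "low"

print(rot_risk(["short question", "short answer"]))  # → low
```

A warning like this could prompt you to summarize and restart before quality visibly degrades, rather than after.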
Symptoms to Watch For
The model never says “I’m confused by my context.” It just gives worse answers. Common signals:
- Contradictory suggestions
Claude recommends something it explicitly told you not to do 20 messages ago.
- Over-explanation of simple things
You ask a direct question and get a 500-word essay full of caveats. The model is trying to reconcile conflicting context.
- Wrong file references
Claude talks about code in a file you deleted or heavily modified. It’s reasoning from stale context.
- Regression to defaults
You’ve clearly established the project uses Go, but Claude starts suggesting Python snippets or generic pseudo-code.
- Solution drift
Each iteration of a solution drifts slightly from the last, as if the core constraints have been lost.
If you see two or more of these at once, it’s usually faster to start a fresh session than to keep fighting the rot.
Why Context Rot Is Worse with Code
Context rot affects all AI conversations, but it’s especially damaging in coding sessions:
- Code is precise.
A slightly wrong English summary might still be useful. Slightly wrong code just breaks.
- Dependencies matter.
If the model is confused about your types, interfaces, or function signatures, errors cascade across multiple suggestions.
- State changes rapidly.
During active development, the codebase can change fundamentally in 30 minutes. Old context describes a world that no longer exists.
- Errors accumulate concretely.
When context rot causes a bad suggestion and you implement it, that bad code becomes part of your codebase. Future suggestions must work around a mistake that came from degraded context.
What Makes Sessions Rot Faster
Some patterns accelerate context rot:
1. Large file dumps early in the session
Pasting 500+ lines of code at the start consumes a big chunk of the context window. Later, only parts of that file are relevant, but the whole thing is still there, diluting the signal.
2. Iterating on the same solution many times
“Try this. Actually, try it differently. Wait, go back to the first approach but modified…”
You end up with multiple contradictory versions of the same code in context. The model has to guess which one is authoritative.
3. Long, repeated error outputs
Stack traces and test logs are necessary, but they’re token-heavy. Pasting the same or similar error multiple times multiplies the noise.
4. Topic switching without cleanup
You fix a bug, then ask about a new feature, then return to the bug. Now the context interleaves two threads, and the model has to infer which constraints apply to which problem.
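Pattern 3 can be mitigated before pasting. A minimal sketch, assuming you collect error logs as strings before sending them to the agent: collapse exact repeats and keep only the top of each trace, since that is usually the informative part.

```python
# Illustrative pre-paste hygiene for error output: deduplicate repeated
# stack traces and truncate each to its first few lines, so the same
# error pasted twice doesn't double the noise in the context window.

def compact_logs(logs: list[str], max_lines: int = 15) -> list[str]:
    seen: set[str] = set()
    out: list[str] = []
    for log in logs:
        # keep the informative top of the trace
        head = "\n".join(log.splitlines()[:max_lines])
        if head not in seen:  # drop exact repeats of the same error
            seen.add(head)
            out.append(head)
    return out
```

Even this crude deduplication can cut the token cost of an iterate-test-paste loop substantially.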
Mitigation Strategies
Once you recognize context rot, you can manage it.
1. Start new sessions for new problems
When you’ve solved one problem and are starting another, open a fresh session. Don’t drag along resolved context.
In practice, people tend to keep using the same chat out of inertia. But a fresh session almost always gives better results for a new task.
2. Use focused sessions
The broader the scope of a session, the faster it rots.
- Prefer sessions focused on a single function, module, or bug.
- When you feel yourself context-switching, resist asking the tangential question in the same chat. Open a new one instead.
3. Summarize and then start fresh
If a session is long but you need to continue:
- Ask Claude to summarize the current state: key decisions, constraints, active approach.
- Copy that summary into a new session.
You lose the raw conversation history but carry forward the distilled signal. Answers usually improve immediately.
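The summarize-then-restart workflow can be scripted. The prompt wording below is an assumption, not a documented Claude Code feature; adapt it to your tool.

```python
# Hypothetical helper for the summarize-and-restart workflow.

HANDOFF_PROMPT = (
    "Summarize this session for a fresh start. Include only: "
    "1) key decisions made, 2) hard constraints, 3) the currently "
    "active approach. Omit abandoned approaches and resolved questions."
)

def handoff_message(summary: str, task: str) -> str:
    """Build the first message of the new session: distilled state plus the next task."""
    return f"Context from a previous session:\n{summary}\n\nCurrent task: {task}"
```

The point of the prompt is the omission clause: abandoned approaches and resolved questions are exactly the noise you want to leave behind.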
4. Prefer precise code context over conversation history
The best fix isn’t better conversation hygiene—it’s better context selection.
Instead of relying on a long chat log, give the model exactly the code that matters right now:
- Paste only the relevant functions, types, and tests
- Use file or symbol references (e.g. @file.ts, MyType) when your tool supports it
With precise, fresh context, the model doesn’t have to reason over an hour of mixed-relevance history.
This is where session memory with code graph awareness beats raw conversation history. Rather than “everything we’ve ever said,” the model gets:
- The current definitions of relevant symbols
- A small set of relevant prior observations
Every request starts from a clean, targeted snapshot instead of a decaying transcript.
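A snapshot-style request might be assembled like this. The data structures are illustrative assumptions, not the API of any real context engine:

```python
# Sketch of "snapshot" context assembly: each request is built from the
# current definitions of relevant symbols plus a few pinned observations,
# instead of replaying the conversation transcript.
from dataclasses import dataclass

@dataclass
class Symbol:
    name: str
    definition: str  # current source text of the symbol

def build_context(symbols: list[Symbol], observations: list[str], task: str) -> str:
    defs = "\n\n".join(f"// {s.name}\n{s.definition}" for s in symbols)
    notes = "\n".join(f"- {o}" for o in observations)
    return f"Relevant code:\n{defs}\n\nKnown constraints:\n{notes}\n\nTask: {task}"
```

Because the symbol definitions are re-read on every request, stale versions of the code simply never enter the context.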
See also: [Session Memory for AI Coding Agents]
The Attention Window Problem
Under the hood, modern LLMs use attention mechanisms: every token can attend to every other token, but not equally.
A known issue is “lost in the middle”: content in the middle of a long context tends to get less attention than content at the start or end.
In coding sessions, that means:
- Your initial architecture description (now in the middle) may be underweighted
- The most recent messages (at the end) get the most attention
- Early, careful setup work gradually stops influencing answers
Practical implication: if you gave important constraints early and want them to stick, repeat them later in the session. It’s a hack, but it works because it moves those constraints toward the end of the context window.
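The repeat-constraints hack can be automated if you drive the model through an API. A minimal sketch, assuming chat-style role/content message dicts (common to most chat APIs, but an assumption here):

```python
# Sketch of the "repeat constraints late" hack: append standing
# constraints as the final message before each request, so they sit at
# the end of the context window instead of getting lost in the middle.

def pin_constraints(messages: list[dict], constraints: list[str]) -> list[dict]:
    reminder = "Reminder of standing constraints:\n" + "\n".join(
        f"- {c}" for c in constraints
    )
    # The reminder goes last, where attention tends to be strongest.
    return messages + [{"role": "user", "content": reminder}]
```

The conversation itself is untouched; only the copy sent to the model gets the trailing reminder.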
How Automated Context Fixes This at the Root
Manual strategies help, but they’re work. The deeper solution is architectural:
Give the model fresh, precise context on every request instead of a growing conversation log.
This is what graph-based context engines do. Instead of "here's everything we've talked about," each request carries the current definitions of the relevant symbols and the handful of prior observations that still apply.
Nicola
Developer and creator of vexp — a context engine for AI coding agents. I build tools that make AI coding assistants faster, cheaper, and actually useful on real codebases.