Are your CLAUDE.md files guzzling up your tokens?
By Magnus Hultberg • 21 February 2026
Last edited: 21 February 2026
This is borderline daft, but I only recently grokked how Claude Code actually uses the CLAUDE.md files (AGENTS.md for other coding agents like Codex) that act as "memory" in a given project. It's obvious in hindsight: to provide working context, they are sent in their entirety along with every request you make. Which specific CLAUDE.md files get added depends on where in your folder structure the agent is working, but I believe the top-level CLAUDE.md and .claude/CLAUDE.md are always included.
This morning I had a "Wait, what?" moment. So my 700-plus lines of project memory files were being sent with every "do this basic task" request? That seemed... wasteful.
Time for an experiment.
I've read about people treating CLAUDE.md like a "library index" - keeping it minimal with just links to detailed docs that Claude reads only when needed. Figured I'd try it.
Claude and I refactored the growing memory files together.
What we achieved:
📉 Top-level CLAUDE.md: 482 lines → 157 lines (67% reduction)
📉 Total context: 730+ lines → 442 lines
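If you want to sanity-check those percentages, here's a quick sketch in Python. The line counts are the ones reported above; the `reduction` helper is just an illustrative name. Since "730+" is a lower bound, the second figure is a minimum.

```python
def reduction(before: int, after: int) -> float:
    """Return the percentage reduction going from `before` to `after`."""
    return (before - after) / before * 100


# Line counts as reported in this post.
top_level = reduction(482, 157)
total = reduction(730, 442)  # 730 is a lower bound ("730+")

print(f"Top-level CLAUDE.md: {top_level:.0f}% smaller")
print(f"Total context: at least {total:.0f}% smaller")
```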
Here's how. We pulled detailed content out into separate files:
- SPECIFICATIONS/            # Active feature specs
  |-- ARCHIVE/               # Completed specs (historical)
- REFERENCE/                 # Implementation docs (loaded on-demand)
  |-- testing-strategy.md
  |-- troubleshooting.md
  |-- technical-debt.md
  |-- ...
The top-level CLAUDE.md now just contains:
- Quick project overview
- Essential architecture patterns
- Links to the detailed stuff
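One risk with an index-style file is that a link quietly goes stale when a doc is moved or archived. Below is a minimal Python sketch that checks for that. It assumes the links are ordinary Markdown links with paths relative to the CLAUDE.md file, which is my assumption rather than anything Claude Code requires; `missing_links` is a hypothetical helper name.

```python
import re
from pathlib import Path

# Matches the target of a Markdown link: [text](target), ignoring any #anchor.
LINK = re.compile(r"\[[^\]]*\]\(([^)\s#]+)")


def missing_links(index_file: Path) -> list[str]:
    """Return relative link targets in index_file that do not exist on disk."""
    text = index_file.read_text(encoding="utf-8")
    root = index_file.parent
    # Skip web URLs; only local paths can be checked on disk.
    targets = [t for t in LINK.findall(text) if "://" not in t]
    return [t for t in targets if not (root / t).exists()]


if __name__ == "__main__":
    index = Path("CLAUDE.md")
    if index.exists():
        for target in missing_links(index):
            print(f"missing: {target}")
```

Run from the project root, it prints one line per link that points at a file that isn't there.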
Does it actually help?
Well. Who am I to know. But those ~300 saved lines aren't loaded with every request anymore. Claude pulls in the detailed docs only when they're relevant to what I'm asking.
Cheaper per request (I think?), clearer context (for sure!), and my SPECIFICATIONS/ folder now shows what I'm actually working on (currently empty - no active work in progress).
Discussion about a suitable structure: about 30 minutes.
Actual time for Claude to act on the agreed approach: 2 minutes.
Probably worth it if you've got growing context files.
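To see whether your own files have grown, here's a rough Python sketch that lists every CLAUDE.md under a project with its line count. It counts all of them, so treat the total as an upper bound: which files actually get sent depends on where the agent is working. The function names are illustrative.

```python
from pathlib import Path


def line_count(path: Path) -> int:
    """Count the lines in one text file."""
    return len(path.read_text(encoding="utf-8").splitlines())


def memory_file_sizes(project_root: Path) -> dict[str, int]:
    """Map every CLAUDE.md under project_root to its line count."""
    return {
        p.relative_to(project_root).as_posix(): line_count(p)
        for p in sorted(project_root.rglob("CLAUDE.md"))
    }


if __name__ == "__main__":
    sizes = memory_file_sizes(Path("."))
    for name, lines in sizes.items():
        print(f"{lines:5d}  {name}")
    print(f"{sum(sizes.values()):5d}  total")
```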