For any agent, memory is the most fundamental pillar. Yet most users treat the LLM that runs the agent as the source of its memory.
Yes, we know LLMs "know everything." They are trained on the literal text of the internet. But when it comes to actual work, 99% of this knowledge is useless, or even problematic. We value the intelligence and reasoning abilities of the LLM, but to work as an agent, it needs a different 'type' of memory. We should learn not to rely on the LLM for every aspect of our agent: the agent must build its practical knowledge from experience.
💾 The Concept of "Memory Forms"
In our agent design, I use the term 'Memory Form' to describe anything that helps the agent produce reliable, reproducible behavior. It's not just text in a file; it's structure.
* Knowledge Files (`.claude/knowledge/`): Static, reference memory, or any other files the agent has access to.
* MCP Tools: Capability memory (remembering 'how' to do something new).
* Hooks: Procedural and reflexive memory (remembering 'when' to do something).
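As a rough sketch, a repository wired with these memory forms might look like this (every file and directory name below except `.claude/knowledge/` is illustrative, not a fixed convention):

```
my-agent/
├── .claude/
│   ├── knowledge/            # static reference memory
│   │   └── caching.md        # (illustrative)
│   ├── hooks/                # hook scripts: reflexive 'when' memory
│   │   └── pre-edit-check.sh # (illustrative)
│   └── settings.json         # hook registration
└── .mcp.json                 # MCP servers: capability 'how' memory
```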
The key idea is simple:
Use the LLM for what it's good at (intelligence and reasoning), and don't try to address all the agent's fundamental requirements with the LLM and its pre-trained memory.
🏛️ The Three Pillars
Now that we are familiar with the concept of memory forms, we define the pillars of our agent as:
1. Memories: The context injection layer (Working Memory) — specifically the local `CLAUDE.md` layer.
2. Hooks Ecosystem: A growing control layer that remembers to inject hints, directives, and reminders at the best times during your work.
3. Intentions: An MCP layer that dynamically generates instructions based on pre-set intentions, confining the agent to what it can do in practice.
In this post, we focus exclusively on the first pillar: 'Memories', specifically the Working Memory functionality of `CLAUDE.md` files.
(We will dive into Hooks and Intentions in future posts).
📂 The CLAUDE.md Hierarchy
Before we look at how memory works dynamically, we need to differentiate between the four types of `CLAUDE.md` files in our system:
1. `~/.claude/CLAUDE.md` (Global): The "User Context." This applies across 'all' your agents. It knows who you are and your universal preferences.
2. `./CLAUDE.md` (Agent): The "Identity." This defines who 'this specific agent' is (e.g., a Coding Assistant vs. a Legal Aide).
3. `./.claude/CLAUDE.md` (Brain): The "Manual." This tells the agent how to operate its own internal machinery, mainly during self-improvement periods (hooks, tools, etc.).
4. `./**/CLAUDE.md` (Local): The "Dynamic Working Memory".
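On disk (project and subdirectory names are illustrative), the four types sit at these levels:

```
~/.claude/CLAUDE.md           # 1. Global: user context, shared by all agents
my-agent/
├── CLAUDE.md                 # 2. Agent: identity
├── .claude/
│   └── CLAUDE.md             # 3. Brain: the operating manual
└── src/
    └── cache/
        └── CLAUDE.md         # 4. Local: dynamic working memory
```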
It is this fourth category—the Local `CLAUDE.md` files—that solves the day-to-day memory problem.
🧠 The Living Brain: How Local CLAUDE.md Becomes Working Memory
When you see a markdown file in a repo, you usually think of it as documentation—static, dead text that you read once and forget.
In our system, Local `CLAUDE.md` files are alive.
They function like biological tissue: they absorb nutrients (information), structure them (planning), support action (execution), and then regenerate (condense). I often define a working paradigm for my agents (e.g., OPEV(R+): observe, plan, execute, verify, report, condense) and, within that definition, spell out the local `CLAUDE.md` files and their roles as working memory. Here is how this "Living Memory" cycle works, and why it makes your agent dramatically smarter.
🧽 1. OBSERVE: The Sponge Phase
When we start a complex task, the agent doesn't just start working immediately. It starts by 'Absorbing'.
It launches parallel tasks to read the current files, search the web, and interview the user for more context. But where does all that information go? It doesn't just vanish into the chat history.
The agent writes it to the local `CLAUDE.md` file in the directory it's working in.
* "I found this pattern in `utils.py`..." -> Written to CLAUDE.md
* "The user wants a 5-minute cache TTL..." -> Written to CLAUDE.md
* "This library has a known bug..." -> Written to CLAUDE.md
Suddenly, that directory's `CLAUDE.md` isn't just a readme; it's a 'scratchpad of relevant context'.
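A hypothetical `src/cache/CLAUDE.md` mid-Observe might read like this (all file names and details are invented for illustration):

```markdown
## Observations
- `utils.py` already exposes a `get_client()` factory; reuse it.
- User wants a 5-minute cache TTL, configurable via an env var.
- The Redis client library has a known reconnect quirk; wrap calls in a retry.
```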
💭 2. PLAN: The Thinking Ground
Now that the context is captured, the agent uses the same `CLAUDE.md` file as a Thinking Ground.
It doesn't keep the plan in its "head" (the context window). It explicitly writes a detailed plan into the file:
* "Step 1: Create client file."
* "Step 2: Update report."
* "Verification: Test with `curl`"
This is crucial because the working motto is "I don't remember; I write down." By externalizing the plan, the agent frees up its cognitive resources for execution.
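A hypothetical plan section in the local `CLAUDE.md` (step details invented) could look like:

```markdown
## Plan
1. Create `cache_client.py` wrapping `get_client()` with a 5-minute TTL.
2. Update `report.py` to read from the cache first.

## Verification
- Hit the report endpoint twice with `curl`; the second call must be a cache hit.
```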
🛠️ 3. EXECUTE: Robust Working Memory
This is where the magic happens. When the Main Agent launches a sub-task (a "Task Agent") to do the heavy lifting, that Task Agent does not have access to your main chat history. It is born fresh, with no context.
But because we wrote everything down, the system works for us.
Automatic Context Injection: as the agent reads or writes files, the relevant nested `CLAUDE.md` files (Agent, Brain, and Local) are automatically prepended to the file at hand.
The agent does not need to read each local `CLAUDE.md` file on its own: whenever local `CLAUDE.md` files exist along the path, they are prepended to every file it reads. Each local `CLAUDE.md` scopes its information to the files and subdirectories that sit at the same level as itself.
This means the Task Agent instantly knows:
* The user's preferences (from the Observe phase).
* The exact plan (from the Plan phase).
* The constraints and patterns.
The `CLAUDE.md` file acts as 'Context Permanence'. Even if the main chat session gets long and fuzzy, the local file is crisp and clear. It anchors the agent's reality.
🌬️ 4. CONDENSE: The Breath Out
Finally, when the work is done, the agent doesn't leave a messy scratchpad. We Condense.
The agent reviews the `CLAUDE.md` file:
* Temporary info (execution logs, raw notes) -> Deleted.
* Permanent learnings ("We use Redis for caching") -> Moved to Knowledge (`.claude/knowledge/`) or upper `CLAUDE.md` files in the nesting hierarchy.
* Directory context ("This folder contains caching logic") -> Kept in CLAUDE.md.
The file returns to "Maintenance Mode"—clean, concise, and ready for the next job.
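After condensation, a hypothetical local `CLAUDE.md` might be as short as this (contents and the knowledge-file name are invented):

```markdown
## Directory Context
This folder contains the caching logic. Cache TTL defaults to 5 minutes;
see `.claude/knowledge/caching.md` for the Redis rationale.
```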
🏎️ Driving the Agent
A good agent learns (through better hooks and well-chosen MCPs) to use its working memory layer more efficiently and effectively. On its own, though, this pillar is not yet complete, since all the instructions for using the local `CLAUDE.md` files in this specific manner live in those same `CLAUDE.md` files.
To maximize adherence to this philosophy, we will also need other mechanisms (Hooks and MCPs) to drive the agent in the right direction, using these files as a dynamic working memory.
There is also a powerful secondary benefit. As we evolve our Hooks and Intentions (the other two pillars), these local `CLAUDE.md` files become the perfect context source for the "internal LLM calls" running inside our control layer. Think of it as giving your agent's 'subconscious' (the automated scripts running in the background) access to the same working memory as the conscious agent.
Also, remember that this is a design choice, and every user can customize how their agent's local `CLAUDE.md` files are used. What stays constant is what we are doing, not how we do it. As long as the agent uses the available space in local `CLAUDE.md` files as a dynamic, ever-evolving working memory to support information retention, the details can be adjusted.
💡 Why This Matters
By treating local `CLAUDE.md` files as Dynamic Working Memory, we solve the Agent Memory Problem.
1. Context is Persistent: It survives beyond the immediate chat window.
2. Tasks are Smarter: Sub-agents inherit the useful context without the main agent repeating it.
3. Self-Evolution: Your agent builds its own knowledge base as it works.
This is how we move from "chatting with a bot" to "engineering a digital mind." We stop hoping the LLM remembers, and we start building a system that cannot forget.
Next time, we will look at other ways to improve our agents' various memory forms.