Messing around with Local Memory systems

Is anyone else building local memory systems and how would you know if your AI's memory system silently broke?

My Local Memory system failed for three sessions before I noticed.

Not loudly. No errors, no warnings. Claude still responded. Queries still returned results. Everything looked fine.

What actually happened: the retrieval index I'd built (the thing Claude uses to remember what I know, what I've decided, what's in progress) had a corrupted cache file. JSON had gone invalid. Every query was silently returning degraded results. The model was answering from partial context and I had no idea.

I found out because an answer felt slightly off. I went digging. The cache had been broken for days.

That's the failure mode nobody talks about when you build on top of AI: the silent ones.

I posted about the five-layer memory system back in May (here): CORTEX for structured state, PALACE for long-term memory, CORPUS for document retrieval, GRAPH for codebase navigation, INGEST for pipelines.

That system is what broke.

I'd built the memory layers but nothing watching the memory layers. A building with no facilities manager. Everything looked like it was running until the one piece holding it together quietly failed.

The fix wasn't complicated. But it needed a different way of thinking about the setup.

Buildings have a facilities manager. Someone who shows up before anyone else, checks that the systems are running, flags anything wrong, and routes incoming deliveries to the right desk, so the people doing the actual work never have to think about whether the infrastructure is functioning.

That's what I built. Not a new AI. A facilities manager for the AI system I already had.

The FM runs locally, at session start, every time.

It's a small local model, Qwen3 running via Ollama, dispatched through a shell script. Costs nothing to run. Takes about 4 seconds.

At every session start it checks six things: index file integrity, last sync completion, sync staleness, session log continuity, knowledge graph status, vector database currency.

If something fails, it writes to an inbox file and the finding injects into Claude's session context before anything else happens. Not a notification I might miss. The first thing Claude reads.

If everything's fine: one line. NOMINAL — all systems clean.

It also handles intake. Incoming files get classified, checked for duplicates, flagged if suspicious, preprocessed before Claude touches them. Classification and dedup don't need intelligence. They need speed and consistency. Local model handles volume at near-zero cost. Claude handles synthesis at premium cost.

Before: I'd open a session and start working. Occasionally something felt off. I'd investigate after the fact.

After: every session starts with a confirmed infrastructure status. Problems surface before work begins.

The system went from something I hoped was working to something I know is working.

If you're building any persistent AI system (research assistant, coaching tool, operator OS, anything with memory that carries across sessions) you have the same silent failure risk I had.

I put together a breakdown of the pattern (local model, health check script, session hook, signals file) here: arielortiz.me/facilities-manager

None of it requires the primary LLM. None of it costs API tokens. All of it runs before Claude opens.

A system that knows when it's broken is fundamentally different from one that doesn't.

Tell me about your systems! Where in your stack are you just trusting things are working without checking?

4 comments

Clief Notes

skool.com/cliefnotes

What we give away free beats most paid courses. Build durable AI systems with a Marine vet and Edinburgh researcher. 40+ lessons, growing.

Leaderboard (30-day)

🔥

+2019

+1652

+1260

+593

+531