I built a coding workflow called Plumbline (https://github.com/BytesFromToby/plumbline) — five AI-agent "skills" that take a feature from idea to verified code. The names lean on a home-building metaphor on purpose: an architect defines what to build, a foreman plans how, a builder writes the code, an inspector signs off. A plumb line is the weighted string a builder hangs to find true vertical — the one reference everything else is checked against. In my workflow, the spec is that line. One way to wire this is a chain: architect calls foreman, foreman calls builder. I deliberately didn't. The agents never call each other. They coordinate entirely through files and folders — and that one choice turned out to be the most important in the whole thing. Chaining them causes two problems: tight coupling, where a failure halfway means restarting the whole pipeline; and — worse with LLMs — a telephone game, where each agent paraphrases the last one's output until the builder is working from a summary of a summary of what I actually asked for.
# The fix: the filesystem is the interface
Every stage reads and writes plain files in a known layout:
Planning/specs/[feature]_spec.md ← architect writes this; it's the source of truth Planning/blueprints/[feature]_BP.md ← foreman writes the build plan output/inspect/ ← inspector drops evidence + a PASS/FAIL stamp
A stage's input is whichever files already exist; its output is whatever it writes. No call graph. The builder runs after the blueprint exists — sequenced by data dependency, not because the foreman told it to. A single CLAUDE.md in each project declares the shared contract (test command, where specs live, the rules for a change) — the schema that lets independently-run agents agree on where everything is without ever talking. # Why it's better than I expected
Every stage runs standalone — the pipeline is a suggestion, not a cage. Intent stops drifting: every stage reads the original spec, not a retelling of it, so the builder checks against what I actually asked for and stops if the plan contradicts it. Independence becomes structural: my inspector only reads the finished files and never saw the build, so its neutrality is enforced by isolation, not good intentions. And everything is inspectable — when something breaks I read the files, not a black-box conversation.
# The transferable lesson
For any multi-agent system: don't make agents depend on each other — make them depend on shared, persistent state. Files are slower than direct calls, but they buy restartability, auditability, and a single source of truth that can't drift. For agents that love to paraphrase, one canonical artifact everyone reads from is what keeps the whole thing honest.