ICM's premise is that structure replaces orchestration: folders and markdown carry each stage's context, and one agent reads the right files at the right moment. The paper assumes that agent is capable. I wanted to see if ICM holds when the agent is a small local model you run yourself.
It does, by leaning on two things ICM already gives you.
Stage-scoped context becomes injection. In ICM the agent roams the workspace and opens what it needs. A model served through Ollama is not that agent. It is an inference endpoint that takes a prompt and returns text, with no file access and no navigation loop, so it cannot roam the folders at all. The engine reads the files and injects each stage's context into the prompt instead. The principle is the same: each stage sees only what it needs, but that context is delivered by code rather than fetched by the model.
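To make the injection step concrete, here is a minimal sketch, not the code from the repos. It assumes a hypothetical layout where each stage is a folder of markdown files, and the names `build_stage_prompt`, `generate`, and the `llama3.2` model tag are placeholders of mine. The HTTP call targets Ollama's local `/api/generate` endpoint with streaming turned off.

```python
import json
import urllib.request
from pathlib import Path

# Ollama's default local endpoint; change it if your server listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_stage_prompt(stage_dir: Path, task: str) -> str:
    """Concatenate only this stage's markdown files, then append the task."""
    parts = []
    # Sorted order keeps the prompt deterministic across runs.
    for path in sorted(stage_dir.glob("*.md")):
        text = path.read_text(encoding="utf-8").strip()
        parts.append(f"## {path.name}\n{text}")
    parts.append(f"## Task\n{task.strip()}")
    return "\n\n".join(parts)


def generate(prompt: str, model: str = "llama3.2") -> str:
    """Send one prompt to the local model and return its text reply.

    The model tag is a placeholder; use whatever you have pulled.
    """
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

The model never sees a path or a directory listing. The engine decides what goes into the prompt, and the folder layout decides what the engine reads.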
"Scripts handle what doesn't need AI" becomes an oracle per stage. ICM keeps the mechanical work out of the model. I extend that one step: every generative stage is checked by a deterministic oracle (for code, the compiler and its tests), because a small model proposes well but can't verify itself. Reliability stays in the structure, not the model.
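A rough sketch of the propose-then-verify loop follows, under stated assumptions. The names `oracle` and `run_stage` are hypothetical, Python's built-in `compile()` stands in for "the compiler", and the tests are a plain callable that raises on failure. The `propose` argument is whatever calls the model; it receives the previous failure message so the next attempt can react to it.

```python
from typing import Callable, Optional, Tuple


def oracle(source: str, tests: Callable[[dict], None]) -> Tuple[bool, str]:
    """Deterministic check: the candidate must compile and pass the tests."""
    try:
        code = compile(source, "<candidate>", "exec")
    except SyntaxError as err:
        return False, f"syntax error: {err}"
    namespace: dict = {}
    try:
        # Assumption for brevity: exec in-process. Real model output should
        # run in a sandbox or a subprocess, not in the engine itself.
        exec(code, namespace)
        tests(namespace)
    except Exception as err:
        return False, f"{type(err).__name__}: {err}"
    return True, "ok"


def run_stage(
    propose: Callable[[str], str],
    tests: Callable[[dict], None],
    max_attempts: int = 3,
) -> Optional[str]:
    """Ask for candidates until one passes the oracle or attempts run out."""
    feedback = ""
    for _ in range(max_attempts):
        candidate = propose(feedback)
        passed, feedback = oracle(candidate, tests)
        if passed:
            return candidate
    # Nothing verified: the stage fails loudly instead of passing bad output on.
    return None
```

The model only ever proposes. Whether a candidate moves to the next stage is decided by code that returns the same verdict on every run.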
That is the whole port. One stage one job, plain-text artifacts, factory vs product, human-reviewable files: all carry over unchanged. ICM and MCP stay complementary, as the paper notes, with the folder structure deciding context and the stage's tools exposed over MCP.
The result is a frontier-free assistant: the same methodology, running on hardware you own, for tasks that are narrow and checkable.
Two repos, MIT, pure stdlib (you bring Ollama):
I also have a coffee test example that is much simpler than the coding assistant.
Thanks for the methodology, Jake. It ported cleanly.