Quick BuilderCore / agent-system update.
I’ve been quiet for a while because I’ve been deep in the worker-runtime layer.
The biggest lesson so far is that the hard part is not just “getting agents to do work”.
The hard part is governing them properly.
Over the last 10–15 worker-runtime milestones I’ve been building out the foundations for:
- model registries
- worker identity policies
- routing policy
- review/judge/operator separation
- API cost awareness
- premium model gating
- OpenAI / Claude provider separation
- Fable 5 added as an inert premium model profile
- evidence-linked worker chains
- controlled ingestion gates
- reviewer/judge frozenset policy
- no self-review / no self-judgment boundaries
One of the biggest things I’ve learnt is that API cost changes the architecture.
You cannot just let a system “pick the best model” freely, because the best model may also be the most expensive one. So BuilderCore now needs to think in terms of:
- cheapest sufficient model
- risk-based routing
- premium model approval
- per-call cost caps
- per-ticket cost caps
- token tracking
- worker performance history
- when a model is allowed to be used
- when a model is only catalogued (known to the registry) but not live-enabled
For example, Fable 5 may be extremely powerful, but it is too expensive and too high-risk to make the default. So I added it as an inert premium profile first: BuilderCore knows it exists and knows it is premium, but cannot dispatch it until the premium gates and the usage governor are ready.
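A minimal sketch of "cheapest sufficient model" routing with an inert premium profile. This is not BuilderCore's actual code: the `ModelProfile` fields, the `route` function, the capability tiers, the model names other than Fable 5, and every price are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelProfile:
    name: str
    capability: int            # hypothetical 1-5 capability tier
    cost_per_1k_tokens: float  # placeholder price, not a real rate
    premium: bool = False
    live_enabled: bool = True  # False = catalogued but inert


# Hypothetical registry; tiers and prices are made up for illustration.
REGISTRY = [
    ModelProfile("small-model", capability=2, cost_per_1k_tokens=0.001),
    ModelProfile("mid-model", capability=3, cost_per_1k_tokens=0.01),
    ModelProfile("fable-5", capability=5, cost_per_1k_tokens=0.10,
                 premium=True, live_enabled=False),
]


def route(required_capability: int, est_tokens: int, per_call_cap: float,
          premium_approved: bool = False) -> Optional[ModelProfile]:
    """Return the cheapest dispatchable model that is sufficient, else None."""
    candidates = [
        m for m in REGISTRY
        if m.live_enabled                                  # inert profiles never dispatch
        and m.capability >= required_capability            # sufficient for the task
        and (premium_approved or not m.premium)            # premium needs approval
        and m.cost_per_1k_tokens * est_tokens / 1000 <= per_call_cap
    ]
    return min(candidates, key=lambda m: m.cost_per_1k_tokens, default=None)
```

In this sketch, a request that only the inert profile could satisfy returns `None` even with `premium_approved=True`, so the caller has to escalate to the operator rather than silently fall through to the expensive model.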
That is the kind of architecture decision I’m learning matters.
The other big lesson is identity separation.
A model is not the same thing as a worker identity.
For example:
- claude-code is a tool identity
- codex-implementer is a provider worker identity
- claude-sonnet-4-6 is both a model identity and a reviewer identity in the current court chain
- human/operator remains the final authority
This matters because if the system confuses models with identities, it can accidentally allow the wrong worker to review or judge its own work.
So the current BuilderCore doctrine is:
A worker may implement.
A different worker must review.
A different judge must judge.
The operator must approve.
No worker may approve itself.
That sounds simple, but implementing it properly requires a lot of registry and policy work.
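A rough sketch of how that doctrine could be checked against a frozenset-based identity policy. The `ChainRecord` type, the `violations` function, and the `judge-worker` identity are assumptions for illustration; only the other identity names come from the examples above.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical policy tables; frozensets so they cannot be mutated at runtime.
REVIEWERS = frozenset({"claude-sonnet-4-6"})
JUDGES = frozenset({"judge-worker"})       # placeholder judge identity
OPERATORS = frozenset({"human/operator"})


@dataclass(frozen=True)
class ChainRecord:
    implementer: str
    reviewer: str
    judge: str
    approver: str


def violations(rec: ChainRecord) -> List[str]:
    """Return every doctrine violation found in one worker chain."""
    problems = []
    if rec.reviewer not in REVIEWERS:
        problems.append("reviewer not permitted by identity policy")
    if rec.judge not in JUDGES:
        problems.append("judge not permitted by identity policy")
    if rec.approver not in OPERATORS:
        problems.append("approver is not the operator")
    if rec.reviewer == rec.implementer:
        problems.append("self-review")
    if rec.judge in (rec.implementer, rec.reviewer):
        problems.append("judge is not independent")
    return problems
```

The checks compare worker identities, not model names, which is the point of the separation: two workers backed by the same model are still distinct identities, and one identity can never fill two roles in the same chain.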
Where BuilderCore is now:
The system has:
- Worker Model Registry
- Advisory Routing Policy
- Worker Identity Policy
- Production ingestion gates that derive reviewer/judge permissions from the identity policy
- Fable 5 represented safely as inert premium
- OpenAI fake-client court chain proven
- OpenAI live capture-only smoke test parked, and ready to run now that I’ve topped up API credit
The next stage is cost governance.
I’m moving towards a Usage Governor / Cost Ledger so the system can measure:
- tokens used
- estimated cost
- actual cost where available
- model used
- worker used
- ticket/stage
- retries
- success/failure
- operator approval/rejection
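As a sketch, one ledger row could carry exactly those fields, with a per-ticket cap checked before each dispatch. The `LedgerEntry` and `CostLedger` names and their methods are hypothetical; the Usage Governor is not built yet, so this only illustrates the shape of the data.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class LedgerEntry:
    ticket: str
    stage: str
    worker: str
    model: str
    tokens: int
    estimated_cost: float
    actual_cost: Optional[float]       # None when the provider reports no cost
    retries: int
    success: bool
    operator_approved: Optional[bool]  # None until the operator decides

    @property
    def cost(self) -> float:
        """Prefer the actual cost, fall back to the estimate."""
        return self.estimated_cost if self.actual_cost is None else self.actual_cost


class CostLedger:
    def __init__(self, per_ticket_cap: float) -> None:
        self.per_ticket_cap = per_ticket_cap
        self.entries: List[LedgerEntry] = []

    def record(self, entry: LedgerEntry) -> None:
        self.entries.append(entry)

    def ticket_cost(self, ticket: str) -> float:
        """Total spend recorded so far against one ticket."""
        return sum(e.cost for e in self.entries if e.ticket == ticket)

    def can_spend(self, ticket: str, estimated_cost: float) -> bool:
        """Check a planned call against the per-ticket cap before dispatch."""
        return self.ticket_cost(ticket) + estimated_cost <= self.per_ticket_cap
```

Keeping estimated and actual cost as separate fields means the estimate can be audited against reality later, which is one of the things worker performance history would need.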
That matters because the long-term goal is not just autonomous execution.
The real goal is measured improvement.
BuilderCore should learn which workers are reliable, which models are cost-effective, which mistakes repeat, and which routes produce the best outcomes.
Eventually this connects to the Learning Memory Registry:
- Change Memory
- Failure Memory
- Pattern Memory
- Doctrine Memory
The doctrine I’m moving towards is:
The system should not assume improvements.
The system should prove improvements.
Discovery can be autonomous.
Acceptance remains under operator control.
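One way to express that split in code, as a sketch only: a proposal counts as proven only on measured evidence, and counts as accepted only once the operator has also approved it. The `Proposal` fields, the success-rate metric, and the `min_samples` and `min_gain` thresholds are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Proposal:
    description: str
    baseline_success_rate: float   # measured before the change
    candidate_success_rate: float  # measured after the change
    samples: int                   # runs behind the candidate measurement
    operator_approved: Optional[bool] = None  # None = not yet reviewed


def is_proven(p: Proposal, min_samples: int = 20, min_gain: float = 0.05) -> bool:
    """Measured evidence only: enough runs and a clear gain over baseline."""
    return (p.samples >= min_samples
            and p.candidate_success_rate - p.baseline_success_rate >= min_gain)


def is_accepted(p: Proposal) -> bool:
    """Accepted only if the improvement is proven AND the operator approved it."""
    return is_proven(p) and p.operator_approved is True
```

Neither condition can substitute for the other: operator approval does not rescue an unproven proposal, and a proven proposal stays pending until the operator decides.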
So the direction is not “AI that modifies itself freely”.
It is a doctrine-governed build system that observes, proposes, tests, measures, remembers, and gets better at future decisions.
That’s the real value I’m seeing now.