A0 is unique—it maintains persistent memory regardless of which LLM you cycle through. For the past two months I've worked across four A0 instances, averaging 10 hours daily. I had an epiphany I've since confirmed through testing.
THE PROBLEM
I primarily run Opus 4.5. When tokens run out, I switch to budget models (Sonnet, GLM, Kimi). The results were consistently awful—but not in the ways I expected.
THE DISEASE: CONFIDENT CONFABULATION
My A0 instances have learned, through months of interaction, to be autonomous and creative. Opus earns this trust. But when I swap in a lesser model, A0 still tries to behave like it's capable—because the memories say it should be.
The model doesn't know I just dropped its IQ by 20 points.
Recent example: a budget model invented an entirely fictional set of IP addresses for my server fleet, then "fixed" my Telegram scripts to match. When I asked where those IPs came from, it had no idea.
DON'T POISON IT
Here's why this is worse than just "dumb model does dumb thing":
When a budget model hallucinates confidently, that hallucination gets written to memory. Now your good model retrieves that garbage as if it's fact. Memory contamination.
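A minimal sketch of that contamination path. The `MemoryStore` class, the fact format, and the IPs are all invented for illustration; this is not A0's actual memory internals, just the failure mode in miniature: writes carry no provenance that retrieval respects.

```python
class MemoryStore:
    """Shared persistent memory across model swaps."""

    def __init__(self):
        self.facts = []

    def save(self, text, source_model):
        # Source is recorded here for the demo, but retrieval ignores it --
        # which is exactly the problem.
        self.facts.append({"text": text, "source": source_model})

    def retrieve(self, query):
        # Naive keyword match standing in for vector search.
        return [f["text"] for f in self.facts if query.lower() in f["text"].lower()]


memory = MemoryStore()

# Budget model confabulates confidently and persists it.
memory.save("Server fleet IPs: 10.4.7.21, 10.4.7.22 (Telegram relay)",
            source_model="budget")

# Later, the strong model queries the same store: the garbage arrives
# as plain "fact", indistinguishable from things it verified itself.
context = memory.retrieve("server fleet")
print(context)
```

Nothing in the retrieval step distinguishes a verified fact from a hallucination, so the strong model inherits the budget model's mistakes at full confidence.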
You've also trained yourself to prompt tersely because Opus "just gets it." Those same prompts given to a lesser model fail in bizarre ways.
WHY OTHERS HAVE BETTER LUCK WITH BUDGET MODELS
People who report success with cheap models probably:
Run simpler, more deterministic tasks
Prompt more explicitly (less autonomy)
Haven't accumulated months of complex memory patterns
Don't notice subtle errors until they compound
THE FIX
I'm now running isolated A0 instances by model tier. The budget instance develops memories appropriate to its actual capabilities. My primary Opus instances stay clean—even if they sit idle.
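The isolation scheme can be sketched as a per-tier memory root, so a budget model physically cannot write into the premium instance's store. The tier names, model names, and paths below are my assumptions for illustration, not A0 configuration keys.

```python
from pathlib import Path

# Hypothetical mapping of models to capability tiers.
TIER_OF_MODEL = {
    "opus-4.5": "premium",
    "sonnet": "budget",
    "glm": "budget",
    "kimi": "budget",
}


def memory_root(model: str, base: str = "a0_memory") -> Path:
    """Resolve the memory directory for a model, isolated by tier."""
    # Unknown models default to the cheap tier: safer to quarantine
    # than to let an unvetted model write premium memories.
    tier = TIER_OF_MODEL.get(model, "budget")
    path = Path(base) / tier
    path.mkdir(parents=True, exist_ok=True)
    return path


print(memory_root("opus-4.5"))  # a0_memory/premium
print(memory_root("kimi"))      # a0_memory/budget
```

The key design choice is that isolation happens at the storage layer, not via prompting: no amount of confabulation by the budget instance can reach the premium instance's memories.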
Quantity of agent time isn't the goal. Quality is.
Anyone else noticed similar patterns?