Busted out of jail and it cost $6.90
$6.90 and 250 calls to Gemini 3.1 Pro to port over 80% of poor man's memory from a claude code plugin to opencode.
I already talked about how my second Max 20 plan got banned. This isn't just what has happened since then, but also what happens moving forward:
  1. I still have my first Max 20 plan (for personal use) that initially kicked off this project. Still need it to finish the 800+ probe test for v1.5 of the paper. Can't switch models for agents, evaluators, scorers, and judges midway.
  2. I had ModelArk's Coding Plan Pro for the odd emergency or for busting out of token jail (side-loaded into claude code), and it continues to run the odd bit of routine work. In hindsight, I should have used those models for the telegram orchestration and maintenance bits to avoid getting banned in the first place. kimi-k2.5 was my go-to for the orchestrator and dola-seed-2.0-pro was my go-to for light coding tasks.
  3. With 2% left before weekly token jail on cc, I bit the bullet and installed opencode. Tried the free minimax2.5 model plus byteplus kimi and dola-seed: great for light coding and conversations, not great for heavy coding tasks.
  4. Switched to Gemini 3.1 Pro preview and found it well suited to the complex task ahead (refactoring claude code plugins for opencode).
Through all the switching post account ban, it has been the same memory folder and files. The agent retained its knowledge (somewhat: deeper retention with some models, pretty basic with others).
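The reason the memory survives model and harness switches is that it's just a folder of plain files any agent can read. A minimal sketch of what a file-based memory layer like this might look like (the folder name, file layout, and function names here are my own illustration, not the actual plugin's):

```python
from pathlib import Path
from datetime import date

MEMORY_DIR = Path("memory")  # hypothetical folder name, not the plugin's actual layout

def remember(note: str) -> None:
    """Append a note to today's memory file; plain markdown, nothing model-specific."""
    MEMORY_DIR.mkdir(exist_ok=True)
    daily = MEMORY_DIR / f"{date.today().isoformat()}.md"
    with daily.open("a") as fh:
        fh.write(f"- {note}\n")

def load_memory() -> str:
    """Concatenate every memory file so any harness/model can inject it as context."""
    MEMORY_DIR.mkdir(exist_ok=True)
    return "\n\n".join(p.read_text() for p in sorted(MEMORY_DIR.glob("*.md")))

remember("switched orchestrator to kimi-k2.5")
print(load_memory())
```

Because the store is plain markdown on disk, "switching models" is just pointing a different agent at the same folder, which matches the retention behaviour described above.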
Then we got to porting one skill at a time, optimising some of them for opencode along the way. 80% done at the time of writing, at about $6 in Vertex AI costs.
Making the switch was a blessing in disguise: we were multi-session at the start of the build, then cross-app (cowork and code), and recent events forced us to accelerate cross-harness / platform / model compatibility. We were single-user before (one memory implementation => one user) and were testing multi-user (single channel) over telegram when we got banned. Thanks to recent events, we're bumping up the timeline for multi-user, multi-channel memory, because that's how institutional knowledge works.
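Going from single-user to multi-user, multi-channel is mostly a namespacing problem on top of the same file store. A rough sketch, with a purely hypothetical directory scheme (the real plugin may organise this differently):

```python
from pathlib import Path

BASE = Path("memory")  # hypothetical root; one tree per (user, channel)

def memory_path(user: str, channel: str) -> Path:
    """Return the memory directory for a (user, channel) pair, creating it if needed.

    Single-user, single-channel is just the degenerate case of this scheme.
    """
    p = BASE / user / channel
    p.mkdir(parents=True, exist_ok=True)
    return p

# How it started: one implicit user, one implicit channel.
solo = memory_path("default", "local")

# What was being tested when the ban hit: many users over one telegram channel.
tg = memory_path("alice", "telegram-ops")
print(tg)
```

The institutional-knowledge angle would then be a matter of deciding which notes stay in a user's tree and which get promoted to a shared one, but that policy is beyond this sketch.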
Once we're done porting the plugin, there's the additional work of:
  • Installing opencode, with the newly ported memory layer, on the dedicated server we just provisioned (Intel Xeon Platinum, 24 cores, 64GB RAM, RTX 5060 8GB)
  • Running ollama on it with smaller Qwen, Mistral, and Gemma models (we are already running MLX Voxtral and Whisper for TTS and STT) to handle the more routine orchestration and maintenance tasks
  • Wiring up some of the demos that previously ran on claude, and we're good to go
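Sending the routine orchestration and maintenance work to the local ollama models, while reserving hosted models for heavy coding, could be sketched roughly like this. The endpoint is Ollama's documented local `/api/generate` API; the routing table and model tags are assumptions for illustration, not the actual setup:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

# Hypothetical routing table: cheap local models for routine work;
# heavy coding goes to hosted models via a separate path not shown here.
ROUTES = {
    "routine": "qwen2.5:7b",  # orchestration, maintenance chores
    "light": "gemma2:9b",     # small edits, summaries
}

def build_request(task_kind: str, prompt: str) -> dict:
    """Build an Ollama /api/generate payload for a given task class."""
    return {"model": ROUTES[task_kind], "prompt": prompt, "stream": False}

def run_local(task_kind: str, prompt: str) -> str:
    """POST the prompt to the local Ollama server and return the response text."""
    data = json.dumps(build_request(task_kind, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    # Inspect the payload without needing a running server.
    print(build_request("routine", "rotate the logs"))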
So moving forward: in the next update of the pmm repo, we're releasing a single clone-and-go repo with both plugins. The claude one we won't update so much, because I need to reserve our compute and tokens on Claude Max 20 for the lab work on paper v1.5. I might even code-name the opencode plugin "boris". But we'll actively develop the opencode plugin, and explore any other harnesses we should be integrating our memory layer with, while keeping it as agnostic as possible.
Clief Notes
skool.com/cliefnotes
Jake Van Clief, giving you the Cliff notes on the new AI age.