For the past few weeks I've been heads down working on a server that uses models to match Anthropic's Opus models in accuracy and agentic ability.
I used recent research "breakthroughs" to test how much weight the research claims actually carry, using local models like Kimi 2.6, DeepSeek V4, Gemini, Qwen, and a couple of others, and hitting the APIs directly.
I've built a CLI harness that connects to this server so I can use it agentically on my projects.
I'm very surprised by the results. I've written a lot of very complex tests to compare my server against 4.7 xhigh (I haven't run them against 4.8), and I tried to get 4.8 to trick my server.
(I take this with a grain of salt, but 4.8 had a difficult time tricking my server.)
Well, I'm very pleased to say that my server, which I've named Quorum, performs exceptionally well. I'm working on making it my primary driver for using AI.
Instead of ~$2 per million tokens, I'm paying around 5-10 cents and getting the same quality.
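For anyone who wants the arithmetic spelled out, here is a rough sketch. It assumes the 5-10 cents is also a per-million-token price, which the numbers above don't state outright, and the constant and function names are just illustrative.

```python
# Rough cost comparison.
# Assumption: both prices are per million tokens (Mtok).
BASELINE_USD_PER_MTOK = 2.00        # the ~$2 figure quoted above
QUORUM_USD_PER_MTOK = (0.05, 0.10)  # the 5-10 cent range quoted above


def savings(baseline: float, cheaper: float) -> tuple[float, float]:
    """Return (how many times cheaper, percent saved) for one price pair."""
    factor = baseline / cheaper
    percent_saved = (1 - cheaper / baseline) * 100
    return factor, percent_saved


for price in QUORUM_USD_PER_MTOK:
    factor, percent = savings(BASELINE_USD_PER_MTOK, price)
    print(f"${price:.2f}/Mtok: {factor:.0f}x cheaper, {percent:.1f}% saved")
```

Under that assumption the gap works out to roughly 20x to 40x cheaper per token.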
The best part is that ICM is the foundation of my orchestration layer. And the speed is impressive.
To save money and still get the same power, I currently have Opus 4.8 delegate all of its work to my server, and 4.8 just checks and tests the work 😜
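The delegate-then-check setup can be sketched roughly like this. This is not Quorum's actual code: `worker` and `reviewer` are hypothetical stand-ins for "call the cheap server" and "have the stronger model check the result", and the retry limit is my own assumption for illustration.

```python
from typing import Callable


def delegate_and_verify(
    task: str,
    worker: Callable[[str], str],          # hypothetical: sends the task to the cheap server
    reviewer: Callable[[str, str], bool],  # hypothetical: stronger model accepts or rejects
    max_attempts: int = 3,                 # assumed retry limit, not from the post
) -> str:
    """Return the first worker result that the reviewer accepts."""
    for _ in range(max_attempts):
        result = worker(task)
        if reviewer(task, result):
            return result
    raise RuntimeError(f"no accepted result after {max_attempts} attempts")


if __name__ == "__main__":
    # Stub callables so the sketch runs without any network access.
    def fake_worker(task: str) -> str:
        return task.upper()

    def fake_reviewer(task: str, result: str) -> bool:
        return result == task.upper()

    print(delegate_and_verify("refactor the parser", fake_worker, fake_reviewer))
```

The point of the shape is that the expensive model only spends tokens on reviewing, while the bulk of the generation happens on the cheaper side.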
I'm super excited about this and just wanted to share!
Thanks all!