Huge token spend / API call volume
Hi guys. I'm new to Agent Zero, but familiar with what I'd consider 'normal' token usage from experience with n8n, bolt, etc. After spending the day with A0 (using Requesty for LLM APIs), I'm constantly seeing single messages eat over 100k tokens across 10+ API calls, which adds up very quickly.

I've tried a lot of potential fixes: adding the 'true' flag for caching in the LiteLLM input field, trying to rein in API call volume and token usage via global behaviour instructions, and going down rabbit holes with Perplexity at the VM level.

Is this standard for A0? That is, is it just a token-hungry monster that sends the full system prompt, a verbatim copy of the chat thread, a summarised copy of the chat thread, and more in every API call for every single message? I'm just trying to wrap my head around this because it doesn't make sense to me, and that could simply be because I'm an A0 noob. Any guidance appreciated 👍
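For context, here's roughly what I understand the caching attempt to mean at the LiteLLM level. This is just a sketch of the Anthropic-style prompt-caching shape that LiteLLM forwards; the helper name and prompt text are my own placeholders, and I'm not certain this is the knob A0's input field actually maps to:

```python
def build_messages(system_prompt: str, user_msg: str) -> list[dict]:
    """Build a LiteLLM-style message list with a cache_control marker
    on the big static system prompt (Anthropic 'ephemeral' prompt caching),
    so repeated calls shouldn't re-bill the whole prompt at full price.
    Hypothetical helper for illustration, not A0 code."""
    return [
        {
            "role": "system",
            "content": [
                {
                    "type": "text",
                    "text": system_prompt,
                    # asks the provider to cache this block between calls
                    "cache_control": {"type": "ephemeral"},
                }
            ],
        },
        {"role": "user", "content": user_msg},
    ]

# The resulting payload would then go through litellm.completion(...)
messages = build_messages("...big A0 system prompt...", "hello")
print(messages[0]["content"][0]["cache_control"])  # {'type': 'ephemeral'}
```

Even if this is wired up correctly, my understanding is caching only discounts the repeated prompt prefix; it wouldn't stop A0 from making 10+ calls per message in the first place.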