I looked at an AI leaderboard so you don’t have to...
Quick translation: it’s basically “Top Trumps for LLMs”, but instead of a dragon vs a robot, it’s:
- memory
- reasoning
- tool use
- how often it doesn’t faceplant on a multi-step task
What the table roughly suggests:
- GPT-5.2 xhigh is the all-rounder, with the best overall scores in this screenshot
- Claude Opus 4.5 is the “close second that often writes like it has read a book”
- Gemini 3 Pro Preview is the “I can remember your entire business plan” option because the context window is enormous
- The “medium” tiers are where you start seeing the classic symptoms:
  - randomly shortening your copy
  - ignoring half the brief
  - confidently inventing details
  - acting like schema markup is a vibe, not a format
What this means for you (and your future sanity):
- If you’re doing multi-step tasks (funnels, automations, CRM logic, debugging, structured content), choose the model that scores well on agentic/tool use
- If you’re pasting big inputs (long pages, brand rules, FAQs, 18 tabs’ worth of chaos), prioritise context window
- If you’re writing persuasive content, remember:
  - leaderboards measure test performance
  - your audience measures whether it sounds like a human who understands humans
My practical rule of thumb:
- Use the “big brain” tier when the job has consequences
- Use the fast one when you’re brainstorming and you don’t mind binning 30% of it
Tiny challenge for today:
- Pick one task you keep avoiding (a sales page section, a follow-up sequence, a messy automation)
- Run it through a higher-tier model once
- Compare time saved, clarity gained and how many times you mutter “for God’s sake” at the screen
P.S. If you want, comment MODEL ME and tell me what you’re using AI for (content, automations, offers, SEO). I’ll tell you which model mode to use for that job and the prompt structure that stops it going off-piste.