The AI Boom’s Unspoken Secret: Are the Reasoning Models Any Better?
I’ve been watching the hype around reasoning-capable AI, and it all seemed so polished: intelligent, methodical, reasonable. We’re told these models can break tough tasks into logical steps, produce a chain of thought, and edge us closer to real “thinking.” But recent research throws cold water on that image.

Apple’s “Illusion of Thinking”

In June, Apple’s research lab published “The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity” (arXiv; Apple). They pitted so-called large reasoning models (LRMs) against clean logic puzzles with programmable complexity, such as Tower of Hanoi, River Crossing, and Blocks World. The findings? These models perform well up to a point, but once complexity crosses a threshold, their accuracy collapses to zero. They don’t ramp up their effort; they give up. Some even stop spending their token budget partway through, essentially throwing in the towel (Arize). Journalists at The Verge reported the same: “LRMs face a complete accuracy collapse beyond certain complexities” (The Verge).

Enterprise Echoes: Jagged Intelligence

Over at Salesforce, researchers have coined the term “jagged intelligence” to describe this erratic behaviour (Salesforce). Yes, reasoning models can shine on medium-difficulty tasks, but their performance is inconsistent, with sharp drops in capability exactly where reliability matters in real-world settings. Enterprise General Intelligence remains aspirational. As Salesforce frames it, businesses need capability + consistency, not just impressive bursts of intelligence (ITPro).
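
To make the evaluation setup concrete, here is a minimal sketch of what a programmable-complexity harness could look like for the Tower of Hanoi case. This is not Apple’s actual code: the function names, the `ask_model` callback, and the disk-count sweep are assumptions for illustration. The idea is simply to generate puzzles of increasing size, replay a model’s proposed moves against the rules, and record accuracy per complexity level, which is where the reported collapse would show up.

```python
def hanoi_solution(n, src="A", aux="B", dst="C"):
    """Return the optimal move list for n disks (2**n - 1 moves)."""
    if n == 0:
        return []
    return (hanoi_solution(n - 1, src, dst, aux)
            + [(src, dst)]
            + hanoi_solution(n - 1, aux, src, dst))

def is_valid_solution(n, moves):
    """Replay a proposed move list and check it legally solves the puzzle."""
    pegs = {"A": list(range(n, 0, -1)), "B": [], "C": []}
    for src, dst in moves:
        if not pegs[src]:
            return False                      # moving from an empty peg
        disk = pegs[src][-1]
        if pegs[dst] and pegs[dst][-1] < disk:
            return False                      # larger disk placed on a smaller one
        pegs[dst].append(pegs[src].pop())
    return pegs["C"] == list(range(n, 0, -1))  # all disks ended on the target peg

def accuracy_by_complexity(ask_model, max_disks=12, trials=5):
    """Sweep puzzle size and record how often the model's answer is valid."""
    results = {}
    for n in range(1, max_disks + 1):
        solved = sum(is_valid_solution(n, ask_model(n)) for _ in range(trials))
        results[n] = solved / trials
    return results

# Placeholder "model": always returns the textbook solution, so accuracy stays
# at 1.0 here. A real LLM call would replace this lambda; the paper's finding
# is that accuracy falls off a cliff once n passes a model-specific threshold.
if __name__ == "__main__":
    print(accuracy_by_complexity(lambda n: hanoi_solution(n), max_disks=8))
```

The appeal of puzzles like this is that difficulty is a single dial (the disk count) and correctness is mechanically checkable, so a collapse in the accuracy curve is hard to explain away as a grading artefact.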