I just read a study I had to share with you guys.

Eight months ago, the best AI agent in the world could only complete 2.5% of real freelance projects to a client-acceptable standard. Today that number is 16.1%.

The Remote Labor Index tests AI agents on actual commissioned work (3D/CAD, architecture, video, web dev, and more). Every deliverable is judged by human evaluators against a professional's paid output, not by a benchmark score. The new leader is Anthropic's Fable 5, at roughly double Opus 4.8 (8.3%) and well ahead of GPT-5.5 (6.3%).

Here's the part I find reassuring: the researchers also tried replacing the human evaluators with an AI judge. It overestimated the newest models' performance by up to 3x. Turns out we can't yet trust an AI to reliably judge another AI's work; human evaluation is still doing the heavy lifting.

So no, humans aren't out of the loop yet. But going from 2.5% to 16.1% in under 8 months is the kind of curve that should have people paying attention, regardless of industry.

Curious how others here read this: a signal of what's coming, or still early enough not to worry?

Source: https://safe.ai/blog/significant-increase-in-digital-labor-automation