Trusting Agents More · AI Automation Society

Trusting Agents More

GPT 5.4 dropped last week, and it felt like the right time to talk about something I've been thinking about: how much more we can actually trust AI agents to finish a task now, especially with the newest models.

The real power of these models right now isn't in a chat window. It's in setting them up inside your terminal or an IDE, where they get actual freedom to do things they couldn't do before (file access, running code, hitting APIs). That's where the magic happens.

MCP server hype has died down, but I think it is a great time to talk about them again.

If you don’t know what that is, it is basically a custom server an AI agent can search to access all of the API endpoints of specific software.

Here's the thing, you don't really need one handed to you anymore. Writing custom instructions for how to use a specific software platform's API lets you basically build your own. You're just treating each platform like a tool the agent can reach for when it needs to.

Now, safeguards still matter. You probably don't want your agent taking out a $5 million loan on your behalf (clearly).

But there's a huge range of tasks where it can get to 90, maybe even 95% done without you touching anything. And that last 5-10%?

That's where you step in for the finishing touches instead of doing the whole thing from scratch.

TLDR: I am realizing I can use claude code for more shit.

5 comments