Martin McFly

AI Automation Society

Activity

Mon

Wed

Fri

Sun

Aug

Sep

Oct

Nov

Dec

Jan

Feb

Mar

Apr

May

Jun

What is this?

Less

Memberships

AI Automation Society

418.6k members • Free

5 contributions to AI Automation Society

Alexandre Alves

3d •

General Discussion 💬

Are humans closed to be replaced by AI?

I just red a study I had to share with you guys. Eight months ago, the best AI agent in the world could only complete 2.5% of real freelance projects to a client-acceptable standard. Today that number is 16.1%. The Remote Labor Index tests AI agents on actual commissioned work (3D/CAD, architecture, video, web dev, and more), with every deliverable judged by human evaluators against a professional's paid output, not a benchmark score. The new leader is Anthropic's Fable 5, roughly double Opus 4.8 (8.3%) and well ahead of GPT-5.5 (6.3%). Here's the part I find reassuring: the researchers also tried replacing human evaluators with an AI judge. It overestimated the newest models' performance by up to 3x. Turns out we can't yet trust an AI to reliably judge another AI's work, human evaluation is still doing the heavy lifting. So no, humans aren't out of the loop yet. But going from 2.5% to 16.1% in under 8 months is the kind of curve that should have people paying attention, regardless of industry. Curious how others here read this: signal of what's coming, or still early enough not to worry? Source: https://safe.ai/blog/significant-increase-in-digital-labor-automation

New comment 3d ago

Martin McFly

1 like • 3d

Short answer: No. But, humans are being augmented by AI. Tasks that used to take hours and/or development time can now be automated by anyone. What does the human do in that case? In my experience, they work on the next thing. Whether that's the task they've been putting off because the now automated one took so long, or the one that required time to think about, or the one that's fun...but was lower on the job description. We're already seeing reports of AI missing the mark: https://www.motor1.com/news/800343/humans-better-than-ai-inspectors/ The desire to automate everything is strong in businesses, but an AI team can easily cost the salary of a junior engineer: https://hackernoon.com/ai-has-hit-cost-parity-with-junior-developers-now-what "Fine" for the short term. What happens in 3 years when your senior engineer moves to management, or moves to another company? There goes the institutional knowledge. Hopefully most of it is documented for the AI. In a decent environment there are technical specs documented, but is that all you need to be successful? I argue it is not. In those 3 years, the company hasn't trained anyone to grow into the role of your AI team. That should be the warning bell for companies - lack of "the next generation" of people to continue your business.

Martin McFly

4d •

General Discussion 💬

AI for hiring teams?

I'm sure many of us have been looking for new roles over the past several years. The "how to find a job" searches all recommend using AI to do various things from searching, to customizing each resume and cover letter, to automatically applying for you. All well and good for the job hunter. I'm currently experiencing this from the other side of things. I am looking for a few software engineers. I spent time building out a job description, improving my team's interview cycle to account for AI tooling, and generally had high hopes to bring on a few new team members. But we had problems almost immediately. 1. The absolute volume of applications. In the first hour we had 231 candidates apply. Roughly 1/3 of those had a perfectly written resume and cover letter attached. 2. After narrowing it down through brute force and amazing work from my technical recruiting team, we started the interview cycle. Nothing unusual - and introduction to the hiring manager, a panel interview, and for the staff level roles a conversation with their immediate director and a peer team they'd work with daily. All used AI tooling to a degree, but some obviously depended on it for everything. Those interviews turned into "interviewing the LLM" 3. After we made decisions on who to extend offers to, we got ready to welcome the new team members. Within days of starting we suspected 1 wasn't who we actually talked to and within a month 2 more raised similar flags. In total 3 out of 7 of the newly onboarded members were terminated because they weren't who they said they were. All three passed background checks. That's the problem. We haven't come up with a solution yet. How are hiring managers and teams handling the ability of AI to mask what a person can do? How are you handling identity verification when someone can easily clone another person and have their LLM fabricate a convincing story?

New comment 3d ago

Martin McFly

0 likes • 4d

@Lautaro Tapia In one case it was literally a case of a stolen identity. In the other two it was skill gaps. They interviewed well, likely because of their AI tool of choice, but when given a real task couldn't wrangle their assistant provided by the company to output what they needed to output.

Martin McFly

0 likes • 3d

@Ahmad Khan That is part of the panel interview. The final answer is a component, but more important is talking through why/how and if the candidate is using an LLM, getting there thoughts on the output provided. Arguably, that's more important, because it helps the panel see how the candidate thinks (or, if they do instead of just "Claude solve this <paste>"). I am very - very - anti-leetcode/puzzles in interviews. The technical discussions are designed to mimic real problems and show the work, yet still be something that can be answered in a 10-15 minutes so that the team can have a discussion with the candidate.

Diana Skipper Szyper

7d •

General Discussion 💬

Using AI safely?

What are your tips for using AI safely? I have created and tested plenty of custom Skills, Routines and custom Claude setups, but I'm really curious on what your practices are, on every level. How do you protect your data?

New comment 5d ago

Martin McFly

1 like • 6d

Treat prompts, skills, documents, etc that you get from anywhere you don't control as you would any code you'd pull from the internet. Don't immediately trust it because it's from someone well known. Supply chain attacks hit code bases frequently (and often in very loud ways). These occur on well known, popular, tools. There are ways to mitigate risks of code in most languages. The same isn't really true for shared AI skills, prompts, etc. yet. So, understand what it's doing before you let it run in your environment. You don't want your credentials going off your machine because you took and ran something within your AI Framework without checking it out. You (hopefully) don't do this with code, don't do it with your AI either.

Martin McFly

0 likes • 6d

@Diana Skipper Szyper Can you share how you've automated that? It's a difficult problem on the software side of things.

Nate Herk

💎

⭐

Oct '24 •

General Discussion 💬

Welcome! Introduce yourself + share a career goal you have 🎉

Let's get to know each other! Comment below sharing where you are in the world, a career goal you have, and something you like to do for fun. 😊

4.5k

75.2k

New comment just now

Martin McFly

7 likes • 7d

I'm Martin! Nice to be here. I've followed Nate on YouTube for a while now, and I'm starting a new role at my company in July. I'm looking forward to building some AI/agentic work flows in that role. The n8n tutorials were incredibly helpful in my current role, and allowed me to free up a lot of time so I could learn more about the "AI" side of things. So, I'm here to learn even more and hopefully take that into my new position.

Martin McFly

2 likes • 7d

@Nigel Vargas Thanks for the welcome!

Jace Freeman

8d •

General Discussion 💬

Are we still using N8N?

With how powerful Claude Code is, are we still building automations for clients using n8n, or is everything now being handled directly in Claude Code? What approach are you all taking?

New comment 4d ago

Martin McFly

1 like • 7d

Yes, I still use n8n for workflows that don't depend on true LLM capabilities. The playbooks that make life easier, but don't need a "brain" behind it. I've found my workflows that need that brain just sit in Claude/Codex skills and agents at this point.

1-5 of 5

Level 2 - Automation Novice 🛠️

6points to level up

Martin McFly

@martin-mcfly-1418

Learning to automate my business