Jack CalibratedAI

Automate What Academy

Memberships

Early AI-dopters

1.3k members • $77/month

Automate What Academy

2.7k members • Free

7 contributions to Automate What Academy

Mason Anderson

4d •

🤖 AI

Where would faster local AI help you most?

This one feels like a big hint at where local AI is heading next. Google introduced DiffusionGemma, an experimental open model that generates text in parallel instead of one token at a time, making it up to 4x faster on dedicated GPUs.

- Up to 4x faster text generation on GPUs
- 1000+ tokens per second on a single NVIDIA H100
- 700+ tokens per second on an RTX 5090
- Built for low-latency local AI workflows
- Generates 256-token blocks in parallel
- Better fit for in-line editing, code infilling, and rapid iteration
- Uses bi-directional attention, so tokens can see the whole block
- Iterative self-correction while generating output
- 26B MoE model, but only 3.8B active parameters during inference
- Can fit in 18GB VRAM when quantized
- Great signal for faster desktop AI agents and local automation tools
- Not meant to beat Gemma 4 on quality yet
- Speed vs quality trade-off is the big theme here

Where do you guys think faster local text generation matters most: coding, agents, editing, support bots, or something else? I would say coding and support bots for me.

Read the full article here: https://blog.google/innovation-and-ai/technology/developers-tools/diffusion-gemma-faster-text-generation

New comment 3d ago

Jack CalibratedAI

2 likes • 3d

My guess is it’s best for voice agents. It still needs to be paired with a voice model, but it can deal with interruptions way better than a standard model.

Mason Anderson

4d •

🤖 AI

Claude Fable 5 is Here!

Anthropic just dropped Fable 5 — their first Mythos-class model available to the public. And if you're running long agents, doing multi-step research, or working with complex codebases, your ceiling just moved. Stripe migrated a 50-million-line Ruby codebase in a single day with Fable 5. That same job would take a team of human engineers two months. That's the kind of capability gap we're talking about here. The jobs you wrote off as too complex to automate? Those are back on the table. Go find one from your backlog, hand it to Fable 5, and see how far it gets — because Opus 4.8 probably couldn't finish it, and Fable 5 just might.

New comment 4d ago

Jack CalibratedAI

2 likes • 4d

It burns tokens twice as fast too.

Mason Anderson

7d •

💬 General

Who's watching?

Apple's WWDC 2026 is the company's annual Worldwide Developers Conference, taking place from June 8 to June 12, featuring the debut of iOS 27, a heavily revamped Siri powered by Google Gemini, and the final keynote presentation by outgoing CEO Tim Cook.

New comment 6d ago

Jack CalibratedAI

1 like • 6d

I always skip these and just get coverage from reporters or the shortened version on YouTube. It looks like they're finally able to deliver on some of the promises they've been making for 2 years and are catching up on some of the things that have been available for some time from Google and ChatGPT.

Mason Anderson

11d •

🤖 AI

MiniMax M3

A startup just gave away a frontier-level coding AI for free. Kind of. MiniMax M3 beats GPT-5.5 and Gemini 3.1 Pro on coding. Lands right behind Claude Opus 4.7. And yes, you can use it today through their API for a fraction of what the big models charge. The catch? Downloading and self-hosting it costs $2,500+ in hardware minimum. But you don’t need that. Hit the API, skip the server bill, get the same power. Read more: https://www.minimax.io/blog/minimax-m3

New comment 9d ago

Jack CalibratedAI

1 like • 10d

I need to test this with Claude Code as the harness to see if it performs as well as Opus.

Jack CalibratedAI

0 likes • 9d

@Orestes Monteagudo Probably after June 15, depending on how much Anthropic caps their automated usage.

Orestes Monteagudo

18d •

🛠️ Problems & Solutions

What is the best way to handle "Service unavailable" errors in n8n's AI nodes?

I’m building an n8n workflow that uses AI nodes, specifically Google Gemini and Claude, as part of a long SEO research and content brief generation process.

The issue I’m trying to handle more effectively is temporary AI model failures like this:

```
Service unavailable - try again later or consider setting this node to retry automatically. This model is currently experiencing high demand. Spikes in demand are usually temporary. Please try again later.
```

The problem is that this error is usually temporary, but when it happens, it can break the whole workflow execution. Since the workflow has already completed several expensive/research-heavy steps before reaching the AI node, I lose a lot of progress and often have to re-run the entire workflow manually. The retry option is on, but the max is 5 attempts every 5 seconds, and sometimes this is not enough time for the service to recover...

I’m currently using n8n self-hosted version 2.22.4.

What I’m considering is this approach:

1. Enable **Continue On Error** on the AI node.
2. Send the error branch to a **Code node**.
3. In the Code node, detect whether the error is a temporary/retryable error, such as:
   * service unavailable
   * high demand
   * try again later
   * timeout
   * 429 / 503 errors
4. If the error is retryable and the retry count is below a maximum number of attempts, send the item to a **Wait node** and then back to the AI node.
5. If the error is not retryable, or the max number of attempts has been reached, either stop the workflow or mark the item as failed.
6. I’m also thinking about moving this logic into a reusable **sub-workflow**, so I can use the same “safe AI call” pattern across multiple AI nodes.

My main question is:

* How do experienced n8n users usually prevent losing progress when an AI node fails after several previous research steps have already completed?

My goal is to make the workflow resilient, avoid repeating expensive research steps, and only retry the AI generation part when the error is temporary.
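The decision logic in steps 3 to 5 could be sketched roughly as below. This is a standalone Python sketch, not n8n's API: the names `is_retryable` and `next_step`, the default of 6 attempts, and the 10-second base delay are all assumptions chosen for illustration. The growing delay is also an assumption on top of the original plan, meant to give the service more time to recover than a fixed 5-second retry does.

```python
import random

# Phrases and status codes treated as temporary, taken from the list in the post.
RETRYABLE_PHRASES = ("service unavailable", "high demand", "try again later", "timeout")
RETRYABLE_STATUS = {429, 503}


def is_retryable(message, status=None):
    """Return True if the error text or status code looks temporary."""
    text = (message or "").lower()
    return status in RETRYABLE_STATUS or any(p in text for p in RETRYABLE_PHRASES)


def next_step(message, attempt, status=None, max_attempts=6,
              base_delay=10.0, max_delay=300.0):
    """Decide whether to wait and retry, or mark the item as failed.

    `attempt` is the number of attempts already made (1 after the first failure).
    All default values are hypothetical and should be tuned per workflow.
    """
    if not is_retryable(message, status):
        return {"action": "fail", "reason": "not retryable"}
    if attempt >= max_attempts:
        return {"action": "fail", "reason": "max attempts reached"}
    # Exponential backoff: 10s, 20s, 40s, ... capped at max_delay.
    delay = min(max_delay, base_delay * 2 ** (attempt - 1))
    # Add up to 10% random jitter so parallel items do not retry in lockstep.
    delay += random.uniform(0, delay * 0.1)
    return {"action": "wait", "wait_seconds": round(delay, 1), "attempt": attempt + 1}
```

In a workflow, the returned `wait_seconds` would presumably feed the Wait node, and `attempt` would be carried on the item so the counter survives each loop back to the AI node.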

New comment 14d ago

Jack CalibratedAI

0 likes • 17d

I haven't built something like that in a bit, but I usually have it save things to Airtable, Google Sheets, or Pinecone, depending on what the data is for. Pinecone might be helpful in your case, but that's only a guess, as I just spun one up recently specifically for an LLM to reference the data that I'm working with. Also, I'm not sure what model you're using, but I haven't seen the kind of consistent outage you're experiencing. I did run into something similar with Claude maybe over a month ago. It turned out the error was misleading: the service wasn't actually down, I just needed to increase the wait between the API calls. Anyway, good luck on fixing your workflow!

Jack CalibratedAI

0 likes • 15d

You're using Gemini, so it definitely shouldn't be a service outage due to high demand (Google is more than capable of handling it). The error in your screenshots does look very similar to what I saw. How many API nodes do you have in this workflow? If you have more than one, I'm pretty sure it is because you're calling them too quickly.
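If the calls really are going out too quickly, the fix is to enforce a minimum gap between them. Here is a minimal Python sketch of that idea, assuming a single process makes the calls in sequence; the class name `MinIntervalThrottle` and its parameters are hypothetical, and inside n8n the same effect would more likely come from a Wait node between the AI nodes.

```python
import time


class MinIntervalThrottle:
    """Keeps consecutive calls at least `min_interval` seconds apart."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the logic can be tested without real waiting
        self._sleep = sleep
        self._last = None      # time of the previous call, None before the first one

    def wait(self):
        """Sleep if the previous call was too recent; return the seconds slept."""
        pause = 0.0
        if self._last is not None:
            elapsed = self._clock() - self._last
            pause = max(0.0, self.min_interval - elapsed)
            if pause:
                self._sleep(pause)
        self._last = self._clock()
        return pause
```

Calling `throttle.wait()` right before each API request spaces the requests out without adding delay when they are already far enough apart.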

Level 2 - Initiator

11 points to level up

Jack CalibratedAI

@jack-calibratedai-6743

Solutions Architect by day, AI tinkerer by night. I run an AI consulting side hustle and I'm always looking for ways to automate the boring stuff.

Active 3h ago

Joined Apr 12, 2026
