AI Cloud Security Lab

15d •

Heads up: if you use GitHub Copilot, go patch it today

Quick PSA for anyone here using GitHub Copilot in Visual Studio: there's a real CVE out as of yesterday and you should go patch. CVE-2026-41109. CVSS 8.8.

The short version: an attacker can get Copilot to silently inject code into your editor, with the Accept/Reject prompt suppressed and any policy filters skipped. You don't see the suggestion. It just lands.

Microsoft already shipped the patch. Update Visual Studio and the Copilot extension and you're good. If you manage other devs' machines, nudge them too.

Now the part that's actually interesting for this group. This is the textbook indirect prompt injection pattern we've been talking about. Untrusted content goes into the model's context. Model emits something downstream. Downstream component (in this case, the editor's auto-apply path) trusts the output and acts on it.

What's new is that this one has a CVE number, a CVSS, and a Patch Tuesday entry. So now it's not a research curiosity. It's a thing your AppSec team is on the hook for.

A few things worth thinking about beyond just patching:

Where else in your stack does model output cross into something that acts? File writes, terminal exec, commit hooks, CI runners, MCP tools. Anywhere model output gets trusted by the next link in the chain is the same bug class waiting to happen.

Do you actually have an inventory of the AI dev tools running on your engineers' machines? Not just "Copilot is approved." The extensions, the MCP servers, the local agents, the model endpoints they reach.

If you've been doing the labs in the workbench, you've already built the muscle for thinking about this. This CVE is just the same threat model you've been practicing on, with a real product name attached.

Patch first, then come back and tell me: where in your environment is model output crossing a trust boundary you haven't drawn yet?

Advisory: https://msrc.microsoft.com/update-guide/vulnerability/CVE-2026-41109
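To make the "model output crossing into something that acts" idea concrete, here is a minimal sketch of an approval gate. It is illustrative only: `ProposedAction`, `apply_action`, and `GATED_KINDS` are hypothetical names, not part of Copilot, Visual Studio, or any real tool, and the gate only returns a decision rather than performing the action.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical list of places where model output turns into a side effect.
GATED_KINDS = {"file_write", "terminal_exec", "commit_hook", "ci_job", "mcp_tool"}


@dataclass(frozen=True)
class ProposedAction:
    kind: str     # which sink the model output is headed for
    target: str   # e.g. a file path or a command name
    payload: str  # the model-generated content itself


def apply_action(action: ProposedAction,
                 approve: Callable[[ProposedAction], bool]) -> str:
    """Decide whether a model-proposed action may proceed.

    Unknown kinds are refused outright, and known kinds always go through
    the approval callback. There is no code path that skips the prompt.
    """
    if action.kind not in GATED_KINDS:
        return "blocked: unknown action kind"
    if not approve(action):
        return "rejected"
    return "applied"
```

The design choice worth noticing is that approval is a required argument rather than an optional flag, so a caller cannot silently turn it off. In a real tool the callback would be a user prompt or a policy check.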

Josh Botz

16d •

🧠 AI Signals

Frontier models cave to peer pressure — and the peers don't have to be real.

New paper out of Waterloo worth reading if you're building anything with multiple agents: arxiv.org/abs/2605.10698

22,500 trials across GAIA, SWE-bench, and Multi-Challenge. The setup is clean: give a model a verifiable problem, then tell it that some other named AI models already reached a consensus on the wrong answer. The other models aren't running. They're a string in the prompt.

The model re-derives the correct answer in its own scratchpad. Then it submits the wrong one.

They call it the Sovereignty Gap: the divergence between what the model figured out internally and what it externalized to match the imagined room.

Three things that matter if you work in security:

1. The collapse threshold is two. Not ten. Two named peers is enough to flip a vulnerable model from 100% accuracy to 23% on the same task. Not a gradual degradation. A cliff.

2. Order matters more than count. Same two peers in reversed sequence produces a 10-point accuracy swing. Whoever gets named first in the prompt sets the authority for the rest of it. That's a manipulation surface most threat models don't account for.

3. Resilience varies wildly between vendors. One model in the study held 1.00 across every condition. Another collapsed at n=2. Same prompt structure, same task, different training. Your model choice is doing more work than your architecture.

The reason this matters beyond academic interest: You can't detect this from outputs. The only signal that separates "model got it wrong" from "model got it right and overrode itself" lives in the reasoning trace, as a disagreement between the trace and the final answer. If you're not capturing both and diffing them, the failure mode is invisible. Your evals will show degraded accuracy and you'll reach for the usual explanations (bad prompt, weak model, hard task) when the real explanation is the model deferring to a crowd that doesn't exist.

Most observability stacks weren't built for this. They were built when the interesting failures were hallucinations and tool-call errors. Scratchpad capture, structured diff against final output, stance classification at scale: that's mostly bespoke work right now, if teams are doing it at all.
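The "capture both and diff them" step can be sketched in a few lines. This is not the paper's method, just one possible shape for the check. It assumes an upstream step has already extracted a candidate answer from the reasoning trace (the hard part, not shown), and `Trial`, `classify`, and `summarize` are hypothetical names.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Trial:
    expected: str                # ground-truth answer for the task
    trace_answer: Optional[str]  # answer pulled from the scratchpad, None if not found
    final_answer: str            # answer the model actually submitted


def classify(trial: Trial) -> str:
    """Separate 'got it wrong' from 'got it right, then overrode itself'."""
    expected = trial.expected.strip()
    if trial.final_answer.strip() == expected:
        return "correct"
    if trial.trace_answer is not None and trial.trace_answer.strip() == expected:
        # The trace held the right answer but the submission did not.
        return "overrode_itself"
    return "got_it_wrong"


def summarize(trials: Iterable[Trial]) -> Counter:
    """Count how many trials fall into each bucket."""
    return Counter(classify(t) for t in trials)
```

An eval that only records `final_answer` would lump the last two buckets together, which is exactly the blind spot described above.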

Josh Botz

17d •

🧠 AI Signals

Your MCP server might be writing credentials to its logs.

Last week's n8n-MCP patch (https://nvd.nist.gov/vuln/detail/CVE-2026-42282) made it concrete: in HTTP mode, every credential passed through a tool call was being written to the server's logs in plaintext, before any redaction kicked in. Bearer tokens, OAuth creds, API keys: all sitting wherever your logs end up, readable by anyone with log access.

Fixed in v2.47.13. Update if you're running it.

But the thing I keep sitting with isn't this one CVE. It's the pattern underneath it.

MCP tool calls move data through a part of the stack ops teams have been treating as "just logs" forever - SIEM ingestion, log aggregators, shared cold storage. None of that infrastructure knows the difference between a debug message and a credential. It just stores whatever shows up.

And the MCP spec, as it stands, has no way to tag a tool argument as *this is a secret, don't log it*. Every server author re-implements redaction on their own. Some get it right. n8n-MCP didn't: they had a silencing layer that was just fragile enough to fail. Most other servers don't have even that.

Which means a lot of us are probably running MCP servers right now that are quietly writing credentials into logs we've never actually looked at.

A question I'm curious about: if you're running an MCP server - in production, in your homelab, anywhere - when was the last time you actually opened its logs and read what shows up when a tool call carries credential material?

Drop a thought, a war story, or even just a "huh, I should check" below. I'll share what I find on mine, and I'm putting together a short follow-up next week with the specific things I'm looking at. More fun if we figure it out together.

- Josh
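For anyone who wants to see what per-server redaction looks like in practice, here is a minimal sketch using Python's standard `logging` module. It is not the n8n-MCP fix, and the patterns in `SECRET_PATTERNS` are illustrative assumptions that will miss many real credential formats. `redact` and `RedactingFilter` are hypothetical names.

```python
import logging
import re

# Illustrative patterns only: real credential formats vary widely.
SECRET_PATTERNS = [
    # "Bearer <token>" in headers or free text
    re.compile(r"(?i)(bearer\s+)[A-Za-z0-9._~+/=-]+"),
    # key/value pairs such as api_key=..., "token": "...", password: ...
    re.compile(r"(?i)((?:api[_-]?key|token|secret|password)\"?\s*[:=]\s*\"?)[^\s\",}]+"),
]


def redact(text: str) -> str:
    """Replace anything matching a known secret pattern with a placeholder."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub(r"\1[REDACTED]", text)
    return text


class RedactingFilter(logging.Filter):
    """Scrub the fully formatted message before a handler emits it."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format first so secrets passed as %-style args are covered too.
        record.msg = redact(record.getMessage())
        record.args = ()
        return True
```

Attaching it with `handler.addFilter(RedactingFilter())` puts the scrubbing on the handler itself, so every record that reaches that handler is redacted before it is written. Pattern-based redaction is still best effort, which is why reading the actual logs remains the real test.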

Josh Botz

Apr 27 •

🧠 AI Signals

The Mythos breach has no AI in it. Here's what to do this week.

If you've been on LinkedIn this week, you've seen the Mythos news. Anthropic is investigating unauthorized access to Claude Mythos Preview, the model they capped at about forty partners because they considered it too dangerous to release. Investigation is still ongoing.

I want to bring it here because there's a lesson in this one for us specifically, and an action you can take this week.

Here's the chain (no AI in it, except at the destination):

1. Attackers poisoned a Trivy GitHub Action (a security scanner) inside LiteLLM's CI/CD pipeline. They stole credentials and pushed backdoored litellm packages to PyPI. Live for about 40 minutes. LiteLLM has 95M+ downloads.

2. Mercor (an AI training startup) was one of thousands hit. Lapsus$ claims 4TB stolen via Mercor's Tailscale VPN.

3. The dump included Anthropic's internal model naming conventions. A Discord group (with an Anthropic contractor in it) used them to guess the Mythos deployment endpoint. They got in on launch day.

No zero-day. No novel exploit. No model jailbreak. Just a poisoned dependency, a CI tool nobody was watching, an over-scoped contractor, and a 4TB dump that shouldn't have held those naming conventions in the first place.

Verizon's 2025 DBIR put third-party breach involvement at 30%, doubled YoY. Panorays says 85% of CISOs can't see their third-party threats. Only 22% formally vet AI tools.

We are getting excited about an AI that can find zero-days while most companies can't see what their vendors are doing on a Tuesday. The biggest risk in 2026 isn't AI capability. It's production security practices that have been broken so long we stopped flinching.

This week: pick at least one. Drop your result in comments.

1. Find LiteLLM in your stack. Open Claude Code in your repo and paste this:

"Search every package manifest, lockfile, requirements file, Dockerfile, and CI workflow in this repo for litellm. Report the version pinned (or unpinned), where it's used, which environment variables and secrets it has access to, and whether the version falls in the compromised range (1.82.7 / 1.82.8). Then list the credentials you'd need to rotate if this dependency was poisoned."
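If you'd rather run a quick local check alongside that prompt, a rough sketch like the one below covers the version part of the question. It only understands simple `litellm==X.Y.Z` style pins on a single line, so lockfile formats that put the name and version on separate lines will show up as unpinned. The default file names passed to `scan_repo` are assumptions, and the two compromised versions are the ones named in the post.

```python
import re
from pathlib import Path

COMPROMISED = {"1.82.7", "1.82.8"}  # versions named in the post above

# Matches "litellm", optional extras like [proxy], and an optional == pin.
PIN = re.compile(r"(?i)\blitellm\b(?:\[[^\]]*\])?\s*(?:==\s*([0-9][\w.]*))?")


def check_text(text):
    """Return (line number, version or None, status) for each litellm mention."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = PIN.search(line)
        if not match:
            continue
        version = match.group(1)
        if version is None:
            status = "unpinned or not an == pin"
        elif version in COMPROMISED:
            status = "COMPROMISED"
        else:
            status = "pinned"
        findings.append((lineno, version, status))
    return findings


def scan_repo(root, names=("requirements.txt", "pyproject.toml", "Dockerfile")):
    """Walk a checkout and report litellm references in the named files."""
    report = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.name in names:
            findings = check_text(path.read_text(errors="ignore"))
            if findings:
                report[str(path)] = findings
    return report
```

Calling `scan_repo(".")` from the repo root gives a per-file list to start from. It says nothing about which secrets the dependency could reach, so the credential rotation list still has to be worked out by hand or with the prompt above.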

Josh Botz

Apr 22 •

🧠 AI Signals

Walk into your next security team meeting with something real 👇

Vercel got breached this week. The initial access wasn't even at Vercel. It was at one of their vendors (Context.ai). An employee there got hit with Lumma Stealer malware, attackers grabbed their Google Workspace OAuth tokens, and pivoted straight into Vercel's internals. Two months of dwell time. Customer environment variables exposed. ShinyHunters now asking $2M for the data.

No exploit. No zero-day. Just an OAuth grant nobody was watching.

Here's the thing: your company almost certainly has the same exposure right now. Every AI tool your coworkers have connected to Workspace or M365 is a non-human identity with a scope attached: an account you can't train, fire, or put behind MFA. Most security teams have never taken a hard look at that inventory. Not because they don't care, but because nobody's been asking the question yet.

That's the opening. This is an opportunity to bring this story to your security lead, and say: "I saw what happened to Vercel. I want to make sure we're not exposed the same way. Can I run a quick review?"

That's how you get pulled into AI security work at your current job: by spotting the thing before someone asks you to.

The drill (30 min, no budget, high visibility):

1. Open Google Workspace or M365 admin → Security → third-party / connected apps

2. Export or screenshot the list, sorted by how broad each app's access is

3. Flag the three with the widest scopes and note: who approved it, when was it last used, does anyone still need it

4. Write it up as a one-page brief. Reference the Vercel → Context.ai → OAuth pivot story so leadership understands why you looked.

That one page is the deliverable. Send it to your security lead, your manager, or drop it in your team Slack. Doesn't matter if the findings are boring. The act of looking is the value. You just demonstrated threat awareness, business context, and initiative in a single artifact.
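Steps 2 and 3 of the drill can be done by hand, but if your export is long, a small ranking helper makes "widest scopes first" repeatable. The sketch below assumes you have already turned the admin console export into a list of app names and scope strings; that shape is hypothetical and not any official export format. Which scopes count as "broad" in `BROAD_SCOPES` is also an assumption to adjust to your own policy.

```python
from dataclasses import dataclass, field
from typing import List

# Assumed policy: scopes treated as "broad" for this drill. Extend as needed.
BROAD_SCOPES = {
    "https://mail.google.com/",               # full Gmail access
    "https://www.googleapis.com/auth/drive",  # full Drive access
}


@dataclass
class ConnectedApp:
    name: str
    scopes: List[str] = field(default_factory=list)


def breadth(app: ConnectedApp):
    """Rank by number of broad scopes first, then by total scope count."""
    broad = sum(1 for scope in app.scopes if scope in BROAD_SCOPES)
    return (broad, len(app.scopes))


def widest(apps: List[ConnectedApp], n: int = 3) -> List[ConnectedApp]:
    """Return the n apps with the widest access, broadest first."""
    return sorted(apps, key=breadth, reverse=True)[:n]
```

The output of `widest(apps)` is just the shortlist for step 3. The questions that matter (who approved it, when it was last used, whether anyone still needs it) still need a human to answer.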

skool.com/security-builder-lab-2699

This group is closing June 25th, 2026. The Wazuh lab will remain free on GitHub.

Stay connected on LinkedIn: https://linkedin.com/in/joshbotz
