Hot take: most teams are solving prompt injection at the wrong layer.
Input filtering catches the obvious stuff. But it does nothing against indirect injection — a malicious instruction buried in a PDF your RAG pipeline just ingested. That payload never touches your input filter.
The real control surface isn't the prompt. It's what your model is allowed to do once it's already compromised.
Four things I'd argue matter more than input sanitization:
→ Tool allowlists instead of open function calling
→ Scoped credentials per agent action (not one super-key)
→ Human-in-the-loop on anything that mutates state
→ Output validation before results touch a downstream system
The mental model shift: assume the prompt is already compromised. Then design the blast radius.
Curious where everyone here lands on this — are you filtering inputs, constraining actions, or both? Drop your setup in the comments 👇
1
0 comments
Stephanie Macahis
3
Hot take: most teams are solving prompt injection at the wrong layer.
powered by
AI Cloud Security Lab
skool.com/security-builder-lab-2699
This group is closing June 25th, 2026. The Wazuh lab will remain free on GitHub.
Stay connected on LinkedIn: https://linkedin.com/in/joshbotz
Build your own community
Bring people together around your passion and get paid.
Powered by