AI agents that hack, with receipts
A Python framework for building offensive and defensive security agents that has already found real bugs in robots, heat pumps, and e-commerce platforms.

What it does
CAI is a lightweight Python framework for wiring LLMs into cybersecurity workflows — reconnaissance, exploitation, vulnerability discovery, and mitigation. It supports 300+ models (OpenAI, Anthropic, DeepSeek, Ollama, etc.) and ships with built-in security tools and guardrails against prompt injection and dangerous command execution. The community edition is free for research; a €350/month “PRO” tier offers unlimited access to their own alias1 model.
The interesting bit
The project backs up its claims with documented case studies: they found exposed RSA keys and GDPR violations in Unitree G1 humanoid robots, ranked top-10 in the Dragos OT CTF 2025, and inspired HackerOne’s production AI deduplication agent. This isn’t theoretical — the README is essentially a bug-bounty portfolio.
Key highlights
- Modular agent-based architecture for specialized security tasks
- Built-in defenses against prompt injection and reckless command execution
- Cross-platform: Linux, macOS, Windows, and Android
- Active research program with multiple arXiv papers (2504.06017, 2506.23592, and others)
- Community edition installable via
pip install cai-framework
Caveats
- The “beats GPT-5 in CTF benchmarks” claim is sourced to their own benchmarking page, not an independent evaluator
- Heavy commercial push for the PRO tier throughout the README; the open-source edition’s limitations aren’t clearly specified
- Some case studies link to external commercial pages rather than detailed technical writeups
Verdict
Worth a look if you’re building security automation or researching AI-driven pentesting. Skip it if you need a mature, vendor-neutral framework without upsell friction — the open-source/community boundary here is deliberately blurry.