aliasrobotics/cai

AI agents that hack, with receipts

A Python framework for building offensive and defensive security agents that has already found real bugs in robots, heat pumps, and e-commerce platforms.

8.9k stars Python Domain AppsAgents
Velocity (7d): +21 ★/day · Trend: steady

What it does

CAI is a lightweight Python framework for wiring LLMs into cybersecurity workflows: reconnaissance, exploitation, vulnerability discovery, and mitigation. It supports 300+ models (OpenAI, Anthropic, DeepSeek, Ollama, etc.) and ships with built-in security tools and guardrails against prompt injection and dangerous command execution. The community edition is free for research; a €350/month “PRO” tier offers unlimited access to the maintainers’ own alias1 model.

The interesting bit

The project backs up its claims with documented case studies: they found exposed RSA keys and GDPR violations in Unitree G1 humanoid robots, ranked top-10 in the Dragos OT CTF 2025, and inspired HackerOne’s production AI deduplication agent. This isn’t theoretical — the README is essentially a bug-bounty portfolio.

Key highlights

  • Modular agent-based architecture for specialized security tasks
  • Built-in defenses against prompt injection and reckless command execution
  • Cross-platform: Linux, macOS, Windows, and Android
  • Active research program with multiple arXiv papers (2504.06017, 2506.23592, and others)
  • Community edition installable via pip install cai-framework
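
To make the "defenses against reckless command execution" highlight concrete, here is a minimal sketch of what a command guardrail can look like in general. It is not taken from CAI's source and does not use CAI's API; the names `BLOCKED_PROGRAMS`, `SHELL_METACHARACTERS`, and `is_command_allowed` are hypothetical, and the deny-list is deliberately tiny and conservative.

```python
import os
import shlex

# Hypothetical deny-list for illustration only; not CAI's actual rules.
BLOCKED_PROGRAMS = {"rm", "mkfs", "dd", "shutdown", "reboot"}

# Characters that let one command chain, pipe, redirect, or substitute another.
# Rejecting them outright is conservative: it also blocks harmless quoted uses.
SHELL_METACHARACTERS = set(";|&<>`$")


def is_command_allowed(command: str) -> bool:
    """Return True only if the command passes this illustrative guardrail."""
    # Refuse anything that could smuggle in a second command.
    if any(ch in SHELL_METACHARACTERS for ch in command):
        return False
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes: the command cannot be parsed reliably.
        return False
    if not tokens:
        return False
    # Compare the program name without its directory, so /bin/rm matches rm.
    program = os.path.basename(tokens[0])
    return program not in BLOCKED_PROGRAMS


if __name__ == "__main__":
    for candidate in ["nmap -sV 10.0.0.5", "rm -rf /", "nmap 10.0.0.5; rm -rf /"]:
        verdict = "allow" if is_command_allowed(candidate) else "block"
        print(f"{verdict}: {candidate}")
```

A deny-list like this is easy to bypass (for example through an interpreter such as `python -c`), so a production guardrail would more likely combine an allow-list, sandboxing, and human confirmation for risky actions.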

Caveats

  • The “beats GPT-5 in CTF benchmarks” claim is sourced to their own benchmarking page, not an independent evaluator
  • Heavy commercial push for the PRO tier throughout the README; the open-source edition’s limitations aren’t clearly specified
  • Some case studies link to external commercial pages rather than detailed technical writeups

Verdict

Worth a look if you’re building security automation or researching AI-driven pentesting. Skip it if you need a mature, vendor-neutral framework without upsell friction: the boundary between the free community edition and the paid PRO tier is deliberately blurry.