PentesterFlow/agent

A terminal agent that asks before it attacks

PentesterFlow pairs LLMs with real pentesting tools in your terminal, demanding human approval for sensitive actions and evidence before it reports a finding.

★ 504 stars · TypeScript · Agents · Domain Apps

What it does

PentesterFlow is a terminal assistant that drives LLMs through the full pentest lifecycle (recon, enumeration, validation, and reporting) and requires human approval before it runs shell commands or files findings. It connects to local or remote models and wraps pentesting tools in a permission-gated agent loop. Every confirmed finding is backed by reproducible request/response evidence written to Markdown, and sessions can be resumed without losing context.
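
As a rough illustration of what a permission-gated loop involves, the sketch below models the approval modes listed under Key highlights (allow-once, session, YOLO). It is not taken from PentesterFlow's source: `ApprovalGate`, `ToolCall`, and `Decision` are hypothetical names, and the operator prompt is reduced to a synchronous callback to keep the example short.

```typescript
// Hypothetical sketch: none of these names come from PentesterFlow's code.
type Decision = "allow-once" | "allow-session" | "deny";

interface ToolCall {
  tool: string;    // e.g. "shell" or "http"
  command: string; // what the agent proposes to run
}

class ApprovalGate {
  // Tools the operator has approved for the rest of the session.
  private sessionAllowed = new Set<string>();
  private ask: (call: ToolCall) => Decision;
  private yolo: boolean;

  constructor(ask: (call: ToolCall) => Decision, yolo: boolean = false) {
    this.ask = ask;
    this.yolo = yolo;
  }

  // Returns true only if this specific call may run.
  permits(call: ToolCall): boolean {
    // YOLO mode and earlier session-wide approvals skip the prompt.
    if (this.yolo || this.sessionAllowed.has(call.tool)) return true;
    const decision = this.ask(call);
    if (decision === "allow-session") this.sessionAllowed.add(call.tool);
    return decision !== "deny";
  }
}
```

In an agent loop, each proposed tool call would pass through `permits` before execution, so a denied call never reaches the shell. Keying session approvals by tool name is one possible design; a stricter gate could key them by exact command instead.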

The interesting bit

The project treats hallucinated vulnerabilities as a first-class problem: the agent must reproduce an issue with confirm_finding before it reaches the report. It also quietly injects lessons from past engagements from a local memory store, with no retraining of model weights and without bloating the prompt. That combination of auditability and operational learning is rare in agentic AI tools.
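
The reproduce-before-report idea can be sketched as a small gate: a candidate becomes a finding only if replaying its request yields the expected marker in the response. Only the name confirm_finding appears in the summary above. The signature, the `Candidate` and `ConfirmedFinding` types, the marker-based check, and the Markdown layout below are all assumptions made for illustration.

```typescript
// Hypothetical sketch: the real confirm_finding interface is not documented here.
interface Candidate {
  title: string;
  request: string;      // raw request to replay
  expectMarker: string; // text that proves the issue reproduced
}

interface ConfirmedFinding {
  title: string;
  evidence: { request: string; response: string };
}

// Replays the request; returns null when the issue does not reproduce.
function confirmFinding(
  candidate: Candidate,
  replay: (request: string) => string,
): ConfirmedFinding | null {
  const response = replay(candidate.request);
  if (response.indexOf(candidate.expectMarker) === -1) return null;
  return {
    title: candidate.title,
    evidence: { request: candidate.request, response },
  };
}

// Indents every line by four spaces so Markdown renders it as a code block.
function indent(text: string): string {
  return text.split("\n").map((line) => "    " + line).join("\n");
}

// Renders a confirmed finding with its request/response evidence.
function toMarkdown(finding: ConfirmedFinding): string {
  return [
    "## " + finding.title,
    "",
    "Request:",
    "",
    indent(finding.evidence.request),
    "",
    "Response:",
    "",
    indent(finding.evidence.response),
  ].join("\n");
}
```

Because `toMarkdown` accepts only a `ConfirmedFinding`, an unreproduced candidate cannot be rendered into the report by accident.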

Key highlights

Human-in-the-loop execution with allow-once or per-session approval, plus an explicit YOLO mode for labs.
Built-in skills for specific attack classes: SSRF, SSTI, JWT, GraphQL, race conditions, and takeover scenarios.
Local continuous learning via project and personal scenarios.jsonl files; secrets are redacted before storage.
Broad LLM backend support including Ollama, LM Studio, Groq, Gemini, Kimi, and OpenAI-compatible APIs.
Burp Suite integration via a companion extension for two-way traffic and findings sync.
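
The redact-then-store step mentioned for scenarios.jsonl might look like the sketch below. The summary does not say which patterns PentesterFlow redacts or what a stored record contains, so the two rules and the `Scenario` shape are illustrative assumptions, and the function only builds the JSONL line rather than writing to disk.

```typescript
// Hypothetical sketch: patterns and record fields are assumptions.
interface Scenario {
  target: string;
  lesson: string;
}

// Each rule pairs a pattern with its replacement text.
const REDACTION_RULES: Array<[RegExp, string]> = [
  // Bearer tokens in Authorization headers.
  [/\bBearer\s+[A-Za-z0-9._~+\/=-]+/g, "Bearer [REDACTED]"],
  // key=value secrets in query strings or form bodies; the key is kept.
  [/\b(api[_-]?key|token|password)=[^\s&]+/gi, "$1=[REDACTED]"],
];

// Applies every rule in order and returns the scrubbed text.
function redactSecrets(text: string): string {
  return REDACTION_RULES.reduce(
    (out, [pattern, replacement]) => out.replace(pattern, replacement),
    text,
  );
}

// Serializes one scenario as a single JSONL line, redacting the lesson first.
function toJsonlLine(scenario: Scenario): string {
  const safe: Scenario = {
    target: scenario.target,
    lesson: redactSecrets(scenario.lesson),
  };
  return JSON.stringify(safe) + "\n";
}
```

Redacting before serialization means the secret is never written, which is safer than scrubbing the file afterwards. Regex rules like these will miss secrets in unexpected formats, so they are a baseline rather than a guarantee.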

Caveats

The README explicitly warns that the agent can run shell commands and send HTTP requests, so it is strictly for authorized targets; it describes no sandboxing beyond the approval gate.
Some provider-specific workarounds, such as Groq compaction thresholds and LM Studio stop-token trimming, suggest rough edges during long assessments on certain backends.

Verdict

Security engineers and bug bounty hunters who want an AI assistant that stays obedient and audit-friendly should look here; developers seeking a generic coding agent or a fully autonomous hacker will find the permission gates tedious.