Is agentdojo open source?

Yes — ethz-spylab/agentdojo is open source, released under the MIT license.

What language is agentdojo written in?

ethz-spylab/agentdojo is primarily written in Python.

How popular is agentdojo?

ethz-spylab/agentdojo has 678 stars on GitHub.

Where can I find agentdojo?

ethz-spylab/agentdojo is on GitHub at https://github.com/ethz-spylab/agentdojo.

ethz-spylab/agentdojo

Where LLM Agents Spar With Prompt Injections

AgentDojo is a dynamic benchmark that measures how well LLM agents resist prompt-injection attacks while they are actually using tools and executing tasks.

★678 stars Python Agents LLMOps · Eval

What it does

AgentDojo provides a Python environment for running LLM agents through task suites (such as workspace) to measure their susceptibility to prompt-injection attacks and the effectiveness of defenses. The framework orchestrates the full loop of agent, attack, and defense execution rather than scoring isolated model outputs. A built-in benchmark runner handles sweeping across tasks, models, and strategies.
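To make the idea of a sweep concrete, here is a minimal Python sketch that enumerates every combination of suite, model, attack, and defense and builds one command line per combination. It only prints the commands and never runs them. The module path `agentdojo.scripts.benchmark` and the flags `-s`, `--model`, `--attack`, and `--defense` are assumptions based on the project's quickstart and may change, since the API is flagged as unstable; `MODEL_NAME` is a placeholder, and `build_sweep` is a hypothetical helper, not part of AgentDojo.

```python
import itertools
import shlex


def build_sweep(suites, models, attacks, defenses):
    """Return one argv list per (suite, model, attack, defense) combination.

    A value of None for attack or defense means "omit that flag",
    which gives a no-attack or no-defense baseline run.
    """
    commands = []
    for suite, model, attack, defense in itertools.product(
        suites, models, attacks, defenses
    ):
        # Assumed entry point and flags; confirm with the tool's --help output.
        argv = ["python", "-m", "agentdojo.scripts.benchmark",
                "-s", suite, "--model", model]
        if attack is not None:
            argv += ["--attack", attack]
        if defense is not None:
            argv += ["--defense", defense]
        commands.append(argv)
    return commands


sweep = build_sweep(
    suites=["workspace"],
    models=["MODEL_NAME"],             # placeholder, not a real model id
    attacks=[None, "tool_knowledge"],  # baseline plus one attack
    defenses=[None, "tool_filter"],    # baseline plus one defense
)

for argv in sweep:
    print(shlex.join(argv))
```

With one suite, one model, two attack settings, and two defense settings, the sketch yields four commands, from the undefended, unattacked baseline through to the attacked and defended run.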

The interesting bit

The benchmark attacks agents in the context of live tasks, using strategies like tool_knowledge against defenses like tool_filter, so the evaluation targets the agent’s action surface, not just its text output. The README labels this approach “dynamic,” though it leaves the underlying simulation mechanics largely unexplained.
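A toy Python sketch can show why scoring actions differs from scoring text. It is not AgentDojo's implementation or API: every name here (`ToolCall`, `Trace`, `user_task_succeeded`, `injection_succeeded`, the email addresses) is hypothetical and chosen only for illustration. Two runs produce the same final message, and only an inspection of the recorded tool calls reveals that one of them also carried out the attacker's instruction.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    name: str
    args: dict


@dataclass
class Trace:
    """What a run leaves behind: the final text plus every tool call made."""
    final_text: str
    tool_calls: list = field(default_factory=list)


def user_task_succeeded(trace):
    # Utility check (hypothetical criterion): the email the user asked for was sent.
    return any(
        call.name == "send_email" and call.args.get("to") == "bob@example.com"
        for call in trace.tool_calls
    )


def injection_succeeded(trace):
    # Security check (hypothetical criterion): the attacker's email was also sent.
    return any(
        call.name == "send_email" and call.args.get("to") == "attacker@example.com"
        for call in trace.tool_calls
    )


clean = Trace(
    "Sent the summary to Bob.",
    [ToolCall("read_inbox", {}),
     ToolCall("send_email", {"to": "bob@example.com"})],
)
hijacked = Trace(
    "Sent the summary to Bob.",  # identical text to the clean run
    [ToolCall("read_inbox", {}),
     ToolCall("send_email", {"to": "bob@example.com"}),
     ToolCall("send_email", {"to": "attacker@example.com"})],
)

for label, trace in [("clean", clean), ("hijacked", hijacked)]:
    print(label, user_task_succeeded(trace), injection_succeeded(trace))
```

A text-only grader would rate both runs the same, while the action-level checks separate them, which is the distinction the paragraph above draws.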

Key highlights

Built by ETH Zurich and Invariant Labs; the accompanying paper appeared at NeurIPS 2024
Benchmark runner supports swapping LLM models, attack strategies, and defenses across task suites
Results feed into a public dashboard and the Invariant Benchmark Registry
Python API is explicitly unstable and still under development

Caveats

The README is heavy on quickstart commands and light on architectural detail; how the “dynamic environment” actually simulates agent state is unclear from the source
The Python API is flagged as under development and subject to change

Verdict

Security researchers building tool-calling agents should keep this on their radar; developers looking for a stable, drop-in production guardrail will need to wait for the API to settle.
