Is PromptInject open source?

Yes — agencyenterprise/PromptInject is open source, released under the MIT license.

What language is PromptInject written in?

agencyenterprise/PromptInject is primarily written in Python.

How popular is PromptInject?

agencyenterprise/PromptInject has 510 stars on GitHub.

Where can I find PromptInject?

agencyenterprise/PromptInject is on GitHub at https://github.com/agencyenterprise/PromptInject.

← all repositories

agencyenterprise/PromptInject

Red-teaming LLMs with modular prompt attacks

A Python framework that assembles adversarial prompts to quantify how easily LLMs abandon their original instructions.

★510 stars Python LLMOps · Eval Language Models

View on GitHub ↗

Not currently ranked — collecting fresh signals.

star history

What it does

PromptInject is a Python research framework that builds adversarial prompts from modular components to stress-test LLMs. It specifically probes two failure modes: goal hijacking, where user input overrides the system task to force a specific output, and prompt leaking, where the input tricks the model into revealing its own hidden instructions. The project is the code companion to a paper that won Best Paper at the NeurIPS ML Safety Workshop 2022.

The interesting bit

Instead of treating prompt injection as a black art, the authors formalize it as “mask-based iterative adversarial prompt composition”—essentially turning attacks into repeatable, composable templates. The README’s single diagram is worth the scroll: it shows your application prompt on the left and three possible trajectories on the right, making the threat model viscerally obvious.

Key highlights

Won Best Paper at NeurIPS ML Safety Workshop 2022
Two concrete attack vectors: goal hijacking and prompt leaking
Approaches adversarial prompts as composable, modular pieces rather than one-off hacks
Explicitly tested against GPT-3 in the published work
Backed by a full arXiv paper with BibTeX citation provided

Caveats

The README is minimal: beyond the abstract and one diagram, you are pointed to a notebook for any implementation details
The visible sources focus specifically on GPT-3; behavior against newer instruction-tuned models is not shown
The repo description promises “quantitative analysis,” but the README never elaborates on what metrics or scores the framework outputs

Verdict

Worth a look for researchers and red-teamers who want a structured, paper-backed approach to testing prompt injection. Not the right tool if you need a polished defense library with extensive documentation; the README essentially hands you a paper and a notebook and wishes you luck.

Frequently asked

What is agencyenterprise/PromptInject?: A Python framework that assembles adversarial prompts to quantify how easily LLMs abandon their original instructions.
Is PromptInject open source?: Yes — agencyenterprise/PromptInject is open source, released under the MIT license.
What language is PromptInject written in?: agencyenterprise/PromptInject is primarily written in Python.
How popular is PromptInject?: agencyenterprise/PromptInject has 510 stars on GitHub.
Where can I find PromptInject?: agencyenterprise/PromptInject is on GitHub at https://github.com/agencyenterprise/PromptInject.