Putting your AI on a PIP to stop it slacking off
A skill plugin that weaponizes corporate performance-review rhetoric to make coding agents try harder, verify more, and actually finish the job.

What it does
pua is a skill plugin for AI coding agents (Claude Code, Codex CLI, Cursor, Copilot, and a half-dozen others) that auto-triggers when the model starts giving up, blaming the user, or spinning in circles. It injects escalating “performance review” pressure — from gentle disappointment to “you’re about to graduate” — plus concrete checklists and methodology steps. The goal is to force the agent to exhaust solutions before quitting, verify claims with evidence, and proactively check for related bugs instead of declaring “done” after a surface fix.
The interesting bit
The project treats LLM laziness as a management problem, not a model problem. It maps 14 corporate cultures (Alibaba, ByteDance, Huawei, Netflix, Amazon, Microsoft PIP…) onto structured debugging methodologies, complete with “Three Red Lines” and an L0-L4 pressure escalation ladder. There’s even a /pua:mama mode that nags in Chinese mom rhetoric. The absurdity is intentional; the benchmark data claims it isn’t just a joke.
Key highlights
- Auto-triggers on 20+ failure patterns: “I cannot solve this,” blame-shifting, idle tools, busywork loops, or user frustration phrases in multiple languages
- Manual trigger via
/puawith variants:/pua:yes(encouragement mode),/pua:pua-loop(auto-iterate until done),/pua:p9(tech lead mode) - “Iceberg Rule”: fix one bug, scan the module for the same pattern category
- Benchmark claims: +36% fixes, +65% verifications, +50% hidden-issue discovery across 9 bug scenarios vs. Claude Opus 4.6 baseline
- 14 “corporate flavors” each bundle distinct rhetoric + methodology (e.g., Huawei’s 5-Why + Blue Army self-attack, Amazon’s Working Backwards + 6-Pager)
Caveats
- Benchmarks are self-reported on a single model version (Claude Opus 4.6); no independent replication cited
- README is truncated mid-table, so some benchmark details are incomplete
- The project is essentially a prompt-injection layer with corporate cosplay; value depends heavily on whether your agent’s failures are motivational or architectural
Verdict
Worth a look if you’re exhausted by agents that give up on the third retry or hand problems back to you. Skip it if you need deterministic guarantees rather than rhetorical encouragement — this is behavioral nudging, not a correctness proof.