An npm package that makes your AI agents share their homework
It captures the messy reality of long-horizon agent tasks and turns execution traces into reusable, shareable learning signals.
What it does
Agent Apprenticeship is a Node.js CLI that wraps existing local agents (Codex, Cursor, Claude Code, OpenClaw, and others) in iterative mentorship loops. It runs tasks, packages the execution traces and any human or model feedback into reusable bundles, and lets you share or consume them through a public ecosystem backed by a GitHub repository. The goal is to treat long-horizon agent work as a compounding dataset rather than something to discard after each run.
The interesting bit
The project ships with a substantive seed dataset: over 500 curated real-world tasks, nearly 500 lessons, and more than a thousand full execution traces. It starts with real content rather than empty infrastructure. It also frames every task in economic terms, attempting to estimate the value of specialized agent labor, which is either a useful metric or a bold assumption depending on your cynicism level.
Key highlights
- Plays nice with agents you already have installed; auto-detects CLIs and offers custom command templates
- Three mentor modes: fully automated (model-assisted), human checkpoint (expert-led), or hybrid draft-and-approve
- Seed dataset spans specialized domains and includes 1000+ agent work episodes
- Experience Packs let you pull community contributions and inject them into new local runs
- Optional ecosystem sharing via GitHub CLI with manual, ask, or automatic contribution modes
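To make the auto-detection highlight concrete, here is a minimal sketch of how a Node.js CLI could look for installed agent binaries on `PATH`. It is an illustration only, not the package's actual implementation: the binary names in `CANDIDATE_AGENTS` and the helpers `findOnPath` and `detectAgents` are hypothetical.

```typescript
import { accessSync, constants, statSync } from "node:fs";
import { delimiter, join } from "node:path";

// Hypothetical binary names, chosen for illustration; the real tool's
// detection list is not documented in this write-up.
const CANDIDATE_AGENTS: string[] = ["codex", "cursor", "claude"];

// Return the first executable file named `binary` found in `pathVar`, or null.
function findOnPath(
  binary: string,
  pathVar: string = process.env.PATH ?? ""
): string | null {
  // Windows resolves commands through extensions; elsewhere the bare name is used.
  const exts = process.platform === "win32" ? ["", ".exe", ".cmd", ".bat"] : [""];
  for (const dir of pathVar.split(delimiter)) {
    if (!dir) continue; // skip empty PATH entries
    for (const ext of exts) {
      const candidate = join(dir, binary + ext);
      try {
        if (!statSync(candidate).isFile()) continue;
        accessSync(candidate, constants.X_OK);
        return candidate;
      } catch {
        // Missing or not executable here; keep scanning.
      }
    }
  }
  return null;
}

// Map each candidate agent name to its resolved path (or null if absent).
function detectAgents(
  names: string[] = CANDIDATE_AGENTS,
  pathVar?: string
): Record<string, string | null> {
  const found: Record<string, string | null> = {};
  for (const name of names) {
    found[name] = findOnPath(name, pathVar);
  }
  return found;
}
```

Calling `detectAgents()` with no arguments scans the current `PATH`; an agent that maps to `null` is one where a custom command template would presumably be needed instead.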
Caveats
- The README describes “training signals” and ecosystem learning but never clarifies whether this means fine-tuning, prompt augmentation, or simple trace replay; how agents actually improve is left unspecified
- Ecosystem sharing requires GitHub CLI to be installed and authenticated, adding a dependency beyond the npm package itself
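Because sharing depends on the GitHub CLI, a preflight check before enabling it can save a confusing failure later. The sketch below is one way to do that from Node.js, using the real `gh --version` and `gh auth status` commands. The `checkGitHubCli` function, the `Runner` type, and the state names are hypothetical and not part of the package.

```typescript
import { spawnSync } from "node:child_process";

// A runner reports only whether a command exited successfully.
// It is injectable so the check can be exercised without gh installed.
type Runner = (cmd: string, args: string[]) => { ok: boolean };

const defaultRunner: Runner = (cmd, args) => {
  const result = spawnSync(cmd, args, { stdio: "ignore" });
  // result.error is set when the binary cannot be spawned (for example ENOENT).
  return { ok: !result.error && result.status === 0 };
};

type GhState = "missing" | "unauthenticated" | "ready";

// Hypothetical preflight: is gh installed, and is a user logged in?
function checkGitHubCli(run: Runner = defaultRunner): GhState {
  if (!run("gh", ["--version"]).ok) return "missing";
  // `gh auth status` exits non-zero when no account is authenticated.
  if (!run("gh", ["auth", "status"]).ok) return "unauthenticated";
  return "ready";
}
```

A caller could fall back to local-only runs when the result is anything other than `"ready"`, and suggest `gh auth login` in the `"unauthenticated"` case.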
Verdict
Useful if you want to archive, share, and reuse agent execution context across a team without building your own loop infrastructure. Less compelling if you are looking for a transparent training or fine-tuning framework, since the actual learning mechanism is unclear from the sources.