Your LLM, now with recursion and a REPL
A Python library that turns language models into recursive engines by giving them a code sandbox and the ability to call themselves as functions.

What it does
RLM swaps the standard llm.completion() call for rlm.completion(), handing the model a REPL environment where context lives as a variable and recursive sub-calls are just function invocations. The model can examine, decompose, and recursively process its own input programmatically rather than stuffing everything into a flat prompt. The library wraps standard API providers (OpenAI, Anthropic, OpenRouter, Portkey, vLLM) and plugs into several sandbox backends.
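To make "recursive sub-calls are just function invocations" concrete, here is a minimal sketch of the idea. It is not RLM's actual API: stub_llm and recursive_count are hypothetical names, and the stub simply counts matching lines so the example runs without an API key. In RLM the model chooses its own decomposition; here the split is hard-coded for illustration.

```python
def stub_llm(lines: list[str]) -> str:
    """Hypothetical stand-in for a model call: reports how many lines mention ERROR."""
    return str(sum("ERROR" in line for line in lines))


def recursive_count(context: list[str], leaf_size: int = 4) -> int:
    """Answer over a long context by recursing on halves until a chunk is small."""
    if len(context) <= leaf_size:
        # Base case: the chunk is small enough for a single (stubbed) model call.
        return int(stub_llm(context))
    mid = len(context) // 2
    # Recursive case: each half is handled by a sub-call, then the results are combined.
    return recursive_count(context[:mid], leaf_size) + recursive_count(context[mid:], leaf_size)


log = ["INFO ok", "ERROR disk full"] * 5  # 10 lines, 5 of them errors
print(recursive_count(log))  # prints 5
```

The point of the sketch is that no single call ever sees the whole log; each one sees a slice, and ordinary function composition stitches the answers together.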
The interesting bit
The authors are betting against JSON tool-calling as the final form of agent interaction. Instead they argue for a CodeAct-style harness where the LLM writes code, manipulates context as objects, and spawns sub-LLM calls natively inside that code. It is a shift from “prompt engineering” to “letting the model engineer its own decomposition.”
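A CodeAct-style step can be sketched in a few lines. This is a toy harness under stated assumptions, not the library's implementation: stub_root_llm, stub_sub_llm, llm_query, and run_codeact_step are all hypothetical names, and both "models" are hard-coded functions. What it shows is the shape of the interaction: the root model returns Python source rather than a JSON tool call, and that source reads the context variable and spawns sub-calls directly.

```python
def stub_sub_llm(prompt: str) -> str:
    """Hypothetical sub-model: returns the longest word in its prompt."""
    return max(prompt.split(), key=len)


def stub_root_llm(task: str) -> str:
    """Hypothetical root model: emits Python source instead of a JSON tool call."""
    return (
        "chunks = [context[i:i + 3] for i in range(0, len(context), 3)]\n"
        "partials = [llm_query(' '.join(chunk)) for chunk in chunks]\n"
        "answer = llm_query(' '.join(partials))\n"
    )


def run_codeact_step(task: str, context: list[str]) -> str:
    """Run the model-written code in a namespace holding the context and a sub-call hook."""
    namespace = {"context": context, "llm_query": stub_sub_llm}
    code = stub_root_llm(task)
    # In-process exec is acceptable for a demo; it is not safe for untrusted code.
    exec(code, namespace)
    return namespace["answer"]


words = ["a", "bb", "ccc", "dddd", "ee", "f", "ggggggg"]
print(run_codeact_step("find the longest word", words))  # prints ggggggg
```

Compared with a JSON tool schema, nothing about chunking or aggregation is defined by the harness here; the generated code decides both.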
Key highlights
- Seven REPL environments: local exec, IPython (in-process or subprocess), Docker, plus cloud sandboxes via Modal, Prime Intellect, Daytona, and E2B.
- Training harness included under training/, built on verifiers and prime-rl, with a worked OOLONG long-context QA example.
- One-line install from PyPI (pip install rlms), Python 3.11+.
- Already adopted downstream by DSPy, Ax, and others.
Caveats
- The default local REPL runs exec in your own process and virtualenv; the README explicitly warns this is “generally safe” but “should not be used for production settings.”
- Prime Intellect sandboxes are marked beta and currently slow, per the authors’ own note.
- The training harness skips sandboxes “for simplicity,” which means you are trading safety for convenience if you use it as-is.
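The first caveat is easy to see in a generic sketch using only the standard library. This is not how RLM implements any of its environments, and run_in_subprocess and host_state are hypothetical names. Code passed to exec shares the host's objects, while code run in a child interpreter does not. A plain subprocess still has full filesystem and network access, so treat it as process separation and not as a sandbox.

```python
import subprocess
import sys

# In-process exec: the executed code can mutate any object it is handed.
host_state = {"api_key": "secret"}
exec("host_state['api_key'] = 'overwritten'", {"host_state": host_state})


def run_in_subprocess(code: str, timeout_s: float = 5.0) -> str:
    """Run code in a separate Python interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,  # collect stdout/stderr instead of inheriting them
        text=True,            # decode output as str
        timeout=timeout_s,    # raises subprocess.TimeoutExpired on runaway code
    )
    if result.returncode != 0:
        raise RuntimeError(result.stderr.strip())
    return result.stdout.strip()


print(host_state["api_key"])            # prints overwritten
print(run_in_subprocess("print(1 + 1)"))  # prints 2
```

The child process never sees host_state at all, which is the property the Docker and cloud backends push further with real isolation.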
Verdict
Worth a look if you are building agentic systems and suspect that flat prompt-and-response loops are hitting a ceiling. Skip it if you need a hardened, production-ready orchestration layer today: the security model still depends heavily on which sandbox you configure.