alexzhang13/rlm

Your LLM, now with recursion and a REPL

A Python library that turns language models into recursive engines by giving them a code sandbox and the ability to call themselves as functions.

Velocity (7d): +26 ★ / day · Trend: steady

What it does

RLM swaps the standard llm.completion() call for rlm.completion(), handing the model a REPL environment where context lives as a variable and recursive sub-calls are just function invocations. The model can examine, decompose, and recursively process its own input programmatically rather than stuffing everything into a flat prompt. The library wraps standard API providers (OpenAI, Anthropic, OpenRouter, Portkey, vLLM) and plugs into several sandbox backends.
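To make the idea concrete, here is a minimal sketch of recursive decomposition over a context held as a plain variable. It is not the rlm API: recursive_answer, stub_llm, and the fixed split-in-half strategy are all hypothetical stand-ins. In RLM the model writes its own decomposition code; here the strategy is hard-coded so the example runs without a model.

```python
from typing import Callable, List

# Hypothetical signature for a model call: (query, context) -> answer text.
LLM = Callable[[str, str], str]


def recursive_answer(lines: List[str], query: str, llm: LLM, max_lines: int = 4) -> str:
    """Answer `query` over `lines`, recursing when the context is too long."""
    if max_lines < 1:
        raise ValueError("max_lines must be at least 1")
    # Base case: the context fits in one call, so make a single flat call.
    if len(lines) <= max_lines:
        return llm(query, "\n".join(lines))
    # Recursive case: split the context, answer each half, then combine
    # the two partial answers with one more call.
    mid = len(lines) // 2
    left = recursive_answer(lines[:mid], query, llm, max_lines)
    right = recursive_answer(lines[mid:], query, llm, max_lines)
    return llm(query, left + "\n" + right)


def stub_llm(query: str, context: str) -> str:
    """Stand-in for a real model: keeps only the lines containing 'ERROR'."""
    return "\n".join(line for line in context.splitlines() if "ERROR" in line)


if __name__ == "__main__":
    log = [f"INFO step {i}" for i in range(10)]
    log[3] = "ERROR disk full"
    log[8] = "ERROR timeout"
    print(recursive_answer(log, "Which errors occurred?", stub_llm))
```

Each sub-call sees only a slice of the input, which is the point: no single call has to hold the whole context in its prompt.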

The interesting bit

The authors are betting against JSON tool-calling as the final form of agent interaction. Instead they argue for a CodeAct-style harness where the LLM writes code, manipulates context as objects, and spawns sub-LLM calls natively inside that code. It is a shift from “prompt engineering” to “letting the model engineer its own decomposition.”
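A CodeAct-style harness can be sketched in a few lines. This is an illustration under stated assumptions, not rlm's implementation: run_model_code, the injected names context and llm_query, and the convention of reading the result from a variable called answer are all hypothetical, and the "model-written" code is a hard-coded string.

```python
from typing import Any, Callable, Dict


def run_model_code(code: str, context: str, sub_llm: Callable[[str], str]) -> Any:
    """Execute model-written code with the context and a sub-LLM call in scope."""
    # The context is an ordinary variable and the sub-call an ordinary function,
    # so the code can slice, loop, and recurse however it likes.
    namespace: Dict[str, Any] = {"context": context, "llm_query": sub_llm}
    exec(code, namespace)  # in-process exec: fine for a demo, not a sandbox
    return namespace.get("answer")


# What a model might emit instead of a JSON tool call (hard-coded here).
MODEL_WRITTEN_CODE = """
chunks = context.splitlines()
summaries = [llm_query(chunk) for chunk in chunks]
answer = " | ".join(summaries)
"""


def first_word(chunk: str) -> str:
    """Stand-in for a sub-LLM: 'summarizes' a chunk as its first word."""
    return chunk.split()[0]


if __name__ == "__main__":
    doc = "alpha one\nbeta two\ngamma three"
    print(run_model_code(MODEL_WRITTEN_CODE, doc, first_word))
```

The contrast with JSON tool-calling is that control flow (the split, the loop, the join) lives in code the model wrote, rather than in a sequence of individually mediated tool calls.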

Key highlights

  • Seven REPL environments: local exec, IPython (in-process or subprocess), Docker, plus cloud sandboxes via Modal, Prime Intellect, Daytona, and E2B.
  • Training harness included under training/, built on verifiers and prime-rl, with a worked OOLONG long-context QA example.
  • One-line install from PyPI (pip install rlms), Python 3.11+.
  • Already adopted downstream by DSPy, Ax, and others.

Caveats

  • The default local REPL runs exec in your own process and virtualenv; the README describes this as “generally safe” but warns that it “should not be used for production settings.”
  • Prime Intellect sandboxes are marked beta and currently slow, per the authors’ own note.
  • The training harness skips sandboxes “for simplicity,” which means you are trading safety for convenience if you use it as-is.
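The first caveat is easy to demonstrate. The sketch below uses only the Python standard library, not rlm; run_in_process, run_in_subprocess, and the SANDBOX_DEMO_FLAG variable are hypothetical names. It shows that code run through in-process exec can change the host process's state, while the same code in a child process cannot. A plain subprocess is still not a security sandbox (it shares your filesystem and network), which is where Docker or the cloud backends come in.

```python
import os
import subprocess
import sys

# Stand-in for model-generated code that touches process state.
UNTRUSTED_CODE = """
import os
os.environ["SANDBOX_DEMO_FLAG"] = "changed"
print("done")
"""


def run_in_process(code: str) -> None:
    """Run code with exec inside the current interpreter (no isolation)."""
    exec(code, {})


def run_in_subprocess(code: str, timeout: float = 30.0) -> str:
    """Run code in a separate Python process and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    os.environ.pop("SANDBOX_DEMO_FLAG", None)
    run_in_process(UNTRUSTED_CODE)
    print("after exec:", os.environ.get("SANDBOX_DEMO_FLAG"))

    os.environ.pop("SANDBOX_DEMO_FLAG", None)
    run_in_subprocess(UNTRUSTED_CODE)
    print("after subprocess:", os.environ.get("SANDBOX_DEMO_FLAG"))
```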

Verdict

Worth a look if you are building agentic systems and suspect that flat prompt-and-response loops are hitting a ceiling. Skip it if you need a hardened, production-ready orchestration layer today: the security model still depends heavily on which sandbox you configure.
