OpenAI's agent framework: batteries included, lock-in optional
A Python SDK that wraps LLM orchestration in familiar abstractions—agents, handoffs, guardrails—while quietly letting you swap out the model backend.

What it does
The OpenAI Agents SDK gives you a structured way to build multi-agent workflows in Python. You define agents with instructions and tools, wire them together via handoffs or nested tool calls, and let a Runner handle execution. The SDK also manages conversation history through Sessions, validates inputs/outputs with Guardrails, and traces everything for debugging.
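To make the agent, handoff, and Runner vocabulary concrete, here is a toy model of the control flow in plain Python. It is a sketch of the idea only and does not use the SDK: `ToyAgent`, `decide`, and `run` are hypothetical names invented for illustration, and a lambda stands in for the LLM's decision.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ToyAgent:
    """Hypothetical stand-in for an agent: a name, instructions, and a policy."""
    name: str
    instructions: str
    # decide() plays the role of the LLM. It returns ("handoff", agent_name)
    # to pass control, or ("final", text) to answer.
    decide: Callable[[str], tuple[str, str]]
    handoffs: list["ToyAgent"] = field(default_factory=list)


def run(start: ToyAgent, user_input: str, max_turns: int = 5) -> tuple[str, str]:
    """Toy runner loop: follow handoffs until some agent gives a final answer."""
    current = start
    for _ in range(max_turns):
        kind, value = current.decide(user_input)
        if kind == "final":
            return current.name, value
        # A handoff may only target an agent this agent was wired to.
        targets = {agent.name: agent for agent in current.handoffs}
        if value not in targets:
            raise ValueError(f"{current.name} cannot hand off to {value!r}")
        current = targets[value]
    raise RuntimeError("max_turns exceeded")


billing = ToyAgent("billing", "Answer billing questions.",
                   lambda text: ("final", "Refund issued."))
support = ToyAgent("support", "Answer technical questions.",
                   lambda text: ("final", "Try restarting."))
triage = ToyAgent(
    "triage",
    "Route the request to the right specialist.",
    lambda text: ("handoff", "billing" if "refund" in text.lower() else "support"),
    handoffs=[billing, support],
)

print(run(triage, "I want a refund"))  # ('billing', 'Refund issued.')
```

The turn limit is the important design choice here: any loop that lets agents pass control to each other needs a bound, or two agents can hand off to each other forever.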
The interesting bit
Despite the OpenAI branding, the SDK is provider-agnostic. It supports the OpenAI Responses and Chat Completions APIs plus 100+ other LLMs via optional integrations like LiteLLM and any-llm. The “Sandbox Agent” concept (new in 0.14.0) is the less-hyped but more practical feature: agents that run inside a container with filesystem access, meant for actual work like inspecting repos, running commands, or applying patches across longer tasks.
Key highlights
- Handoffs and agents-as-tools: Delegate between agents either by passing control or by calling one agent as a tool from another
- Guardrails and human-in-the-loop: Built-in safety checks and explicit human approval points, not afterthoughts
- Tracing UI: Visual debugging of agent runs shown in the screenshot below
- Realtime voice agents: Support for gpt-realtime-2 with the full agent feature set
- Optional extras: Voice support, Redis-backed sessions, and broad LLM compatibility via optional dependencies
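The guardrail and human-approval highlights describe a simple ordering: automated checks run first, then a person signs off, and only then does the action execute. A minimal sketch of that ordering in plain Python follows. It does not use the SDK, and `GuardrailResult`, `no_secrets_guardrail`, and `guarded_call` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GuardrailResult:
    tripped: bool
    reason: str = ""


def no_secrets_guardrail(text: str) -> GuardrailResult:
    """Hypothetical input check: flag text that looks like it holds an API key."""
    if "sk-" in text:
        return GuardrailResult(True, "possible API key in input")
    return GuardrailResult(False)


def guarded_call(
    user_input: str,
    action: Callable[[str], str],
    guardrails: list[Callable[[str], GuardrailResult]],
    approve: Callable[[str], bool],
) -> str:
    """Run every guardrail, then ask for human approval, then act."""
    for check in guardrails:
        result = check(user_input)
        if result.tripped:
            return f"blocked: {result.reason}"
    if not approve(user_input):
        return "rejected by reviewer"
    return action(user_input)


def apply_patch(text: str) -> str:
    # Stand-in for a side-effecting tool call.
    return f"applied: {text}"


# approve is a plain callback here; in practice it would prompt a person.
print(guarded_call("fix typo in README", apply_patch,
                   [no_secrets_guardrail], approve=lambda _: True))
```

Putting the cheap automated checks before the approval step means a reviewer is only asked about requests that have already passed them.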
Caveats
- Requires Python 3.10+; the README does not say whether the design is async-first, despite the concurrency-heavy domain
- The “100+ other LLMs” claim depends on external integrations (LiteLLM, any-llm); native support details are unclear from the README
- The sandbox agent examples currently use a Unix-local client; container portability isn’t demonstrated
Verdict
Worth a look if you want structured agent orchestration without committing to a single model provider. Skip it if you need deep custom control over the execution loop or if you’re already happy with a lighter wrapper like raw LangChain or a hand-rolled state machine.