openai/maddpg

OpenAI's archived multi-agent RL code still sees use, with caveats

Reference implementation of MADDPG, a centralized-training-decentralized-execution algorithm for mixed cooperative-competitive environments.

2k stars · Python · Agents · ML Frameworks

What it does

This is OpenAI’s official code for MADDPG (Multi-Agent Deep Deterministic Policy Gradient), an actor-critic reinforcement learning method where multiple agents learn simultaneously in environments that can be cooperative, competitive, or both. It is built specifically to pair with the Multi-Agent Particle Environments (MPE), a set of simple 2D physics-based scenarios for testing multi-agent behavior.

The interesting bit

The README is unusually honest: the codebase was restructured after publication, so results may differ from those reported in the original NIPS 2017 paper. The original policy ensemble and policy estimation code lives in a Dropbox zip rather than in the repo, which makes exact reproduction a small archaeological dig.

Key highlights

  • Centralized training with decentralized execution: each agent’s critic sees all observations and actions, while each actor sees only its own
  • Supports mixing MADDPG and vanilla DDPG agents in the same environment (e.g., “good” agents vs. adversaries)
  • Command-line interface covers training, checkpointing, evaluation, and benchmarking without extra scaffolding
  • Core algorithm is ~4 files: maddpg.py, replay_buffer.py, plus TensorFlow utilities
  • 1,976 stars suggest it remains a common baseline despite its archived status
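
The first two highlights come down to what each network is allowed to see. The sketch below illustrates that idea in plain Python and is not code from the repository: the function names, the `local` flag, and the agent sizes are all hypothetical, chosen only to show how the actor's input differs from the critic's.

```python
from typing import List, Sequence

Vector = List[float]


def actor_input(agent: int, observations: Sequence[Vector]) -> Vector:
    """Decentralized execution: the actor sees only its own observation."""
    return list(observations[agent])


def critic_input(
    agent: int,
    observations: Sequence[Vector],
    actions: Sequence[Vector],
    local: bool = False,
) -> Vector:
    """Build the critic's input vector for one agent.

    local=False (MADDPG-style): concatenate every agent's observation and action.
    local=True (vanilla DDPG-style): use only this agent's observation and action.
    The `local` flag is an illustrative assumption, not the repository's API.
    """
    if local:
        return list(observations[agent]) + list(actions[agent])
    joint: Vector = []
    for obs in observations:
        joint.extend(obs)
    for act in actions:
        joint.extend(act)
    return joint


# Hypothetical setup: three agents with observation widths 4, 6, 6 and 2-D actions.
observations = [[0.0] * 4, [0.0] * 6, [0.0] * 6]
actions = [[0.0] * 2 for _ in observations]

print(len(actor_input(0, observations)))                        # 4
print(len(critic_input(0, observations, actions)))              # 22 = (4 + 6 + 6) + 3 * 2
print(len(critic_input(0, observations, actions, local=True)))  # 6 = 4 + 2
```

Mixing the two agent types in one environment then amounts to choosing, per agent, which critic input to use during training. At execution time every agent relies on its actor alone.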

Caveats

  • Frozen in time: Python 3.5.4, TensorFlow 1.8.0, OpenAI Gym 0.10.5 — a dependency stack that will fight modern environments
  • No maintenance: Explicitly archived; “no updates expected”
  • Reproducibility gap: The code was restructured and the original ensemble implementation is not in the repo, so the paper's numbers may not materialize
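
To see how far an existing environment sits from the pinned stack, a small check like the one below may help. It is a sketch under assumptions: the distribution names `tensorflow` and `gym` are assumed to be the PyPI names for the pinned packages, the helper names are hypothetical, and `importlib.metadata` requires Python 3.8 or newer, so the script is meant to run in a modern interpreter, not inside the pinned 3.5 environment itself.

```python
import sys
from importlib import metadata

# Versions quoted in the caveats above. Distribution names are assumptions.
PINNED = {"tensorflow": "1.8.0", "gym": "0.10.5"}
PINNED_PYTHON = (3, 5)


def installed_versions(names):
    """Return {name: version string, or None if the package is not installed}."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def find_mismatches(pinned, installed):
    """List (name, wanted, got) for every package that differs from its pin."""
    return [
        (name, wanted, installed.get(name))
        for name, wanted in sorted(pinned.items())
        if installed.get(name) != wanted
    ]


def python_matches(version_info=sys.version_info, pinned=PINNED_PYTHON):
    """Compare only major.minor, since the pin is a 3.5.x release."""
    return tuple(version_info[:2]) == pinned


print("Python matches pin:", python_matches())
for name, wanted, got in find_mismatches(PINNED, installed_versions(PINNED)):
    print("{}: pinned {}, found {}".format(name, wanted, got))
```

Any mismatch suggests building a separate environment for the pinned stack instead of trying to run the code where it is.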

Verdict

Worth a look if you need a readable, cited baseline for multi-agent RL research or are comparing against MADDPG specifically. Skip it if you want production-ready code or a plug-and-play modern framework — this is a 2017 time capsule with sharp edges.
