StarCraft II as a multi-agent Petri dish
A research environment that turns Blizzard's RTS into a benchmark for cooperative reinforcement learning, where every marine and zealot is its own agent.

What it does
SMAC wraps StarCraft II into a Python environment for cooperative multi-agent reinforcement learning. Instead of commanding an army from above, you train individual RL agents to control each unit in small combat scenarios: micromanagement without the macro. It builds on DeepMind’s PySC2 and Blizzard’s ML API, but strips away base-building and economy to focus purely on decentralized unit control.
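To make the "one agent per unit" idea concrete, here is a minimal sketch of the interaction loop with random agents. It assumes the method names SMAC's environment class exposes (`get_env_info`, `reset`, `get_avail_agent_actions`, `step`, `close`). `StubCombatEnv` is a hypothetical stand-in written for this example so the loop runs without StarCraft II installed; its rewards and action masks are made up and do not reflect real game dynamics.

```python
import random


class StubCombatEnv:
    """Hypothetical stand-in that mimics the subset of the SMAC
    environment interface used below. Not part of SMAC."""

    def __init__(self, n_agents=3, n_actions=6, episode_limit=5):
        self.n_agents = n_agents
        self.n_actions = n_actions
        self.episode_limit = episode_limit
        self._t = 0

    def get_env_info(self):
        return {
            "n_agents": self.n_agents,
            "n_actions": self.n_actions,
            "episode_limit": self.episode_limit,
        }

    def reset(self):
        self._t = 0

    def get_avail_agent_actions(self, agent_id):
        # Binary mask over actions: 1 = currently legal for this agent.
        # Here action 0 is arbitrarily masked out for every agent.
        return [0] + [1] * (self.n_actions - 1)

    def step(self, actions):
        # One joint step: the team shares a single scalar reward.
        assert len(actions) == self.n_agents
        self._t += 1
        reward = 1.0  # dummy shared reward
        terminated = self._t >= self.episode_limit
        return reward, terminated, {}

    def close(self):
        pass


def run_random_episode(env, rng):
    """Play one episode where every unit picks a random legal action."""
    n_agents = env.get_env_info()["n_agents"]
    env.reset()
    terminated = False
    episode_reward = 0.0
    steps = 0
    while not terminated:
        actions = []
        for agent_id in range(n_agents):
            # Each agent chooses on its own, restricted to its legal actions.
            mask = env.get_avail_agent_actions(agent_id)
            legal = [a for a, ok in enumerate(mask) if ok]
            actions.append(rng.choice(legal))
        reward, terminated, _ = env.step(actions)
        episode_reward += reward
        steps += 1
    return episode_reward, steps


if __name__ == "__main__":
    env = StubCombatEnv()
    print(run_random_episode(env, random.Random(0)))
    env.close()
```

With SMAC and StarCraft II installed, the stub could in principle be swapped for `StarCraft2Env(map_name="8m")` from `smac.env`. A learning agent would replace the random choice with a policy fed by that unit's local observation.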
The interesting bit
The project ships with pre-configured combat maps and special “RL units” that won’t auto-attack, forcing agents to learn when to fight, flee, or flank. There’s also a companion framework, PyMARL, with reference implementations of QMIX and COMA—so you can go from pip install to published baseline in one stack.
Key highlights
- Decentralized micromanagement: each game unit = one independent RL agent
- Released alongside PyMARL, a companion PyTorch framework bundling QMIX, COMA, and other MARL algorithms
- Supports RLlib and PettingZoo APIs for broader ecosystem compatibility
- Includes replay saving via StarCraft II’s native .SC2Replay format
- Extensible: design new maps in the SC2 Editor, add RL units, register scenarios in Python
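The "register scenarios in Python" step can be pictured as adding an entry to a table of map parameters. The sketch below is illustrative only: `scenario_registry`, `register_scenario`, and the map name `demo_4m` are hypothetical. The field names are modelled on how SMAC's built-in map entries are recalled to look, so treat them as assumptions and check the library's own map module before relying on them.

```python
# Fields assumed to be required per scenario (modelled on SMAC's
# built-in entries from memory; verify against the library before use).
REQUIRED_FIELDS = {
    "n_agents",
    "n_enemies",
    "limit",
    "a_race",
    "b_race",
    "unit_type_bits",
    "map_type",
}

# Hypothetical local registry standing in for the library's own map table.
scenario_registry = {}


def register_scenario(name, params):
    """Validate a scenario description and add it to the local registry."""
    missing = REQUIRED_FIELDS - set(params)
    if missing:
        raise ValueError(f"{name}: missing fields {sorted(missing)}")
    if name in scenario_registry:
        raise ValueError(f"{name}: already registered")
    scenario_registry[name] = dict(params)
    return scenario_registry[name]


register_scenario(
    "demo_4m",  # hypothetical name; a matching map file would have to exist
    {
        "n_agents": 4,          # units controlled by learning agents
        "n_enemies": 4,         # units controlled by the built-in AI
        "limit": 80,            # maximum steps per episode
        "a_race": "T",          # allied race (Terran)
        "b_race": "T",          # enemy race (Terran)
        "unit_type_bits": 0,    # assumed: no unit-type encoding for a single unit type
        "map_type": "marines",
    },
)
```

Validating the entry up front keeps a typo in a field name from surfacing later as a confusing failure halfway through environment setup.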
Caveats
- StarCraft II version lock-in: paper results used SC2.4.6.2.69232, and the README warns performance isn’t comparable across versions
- Official paper run data is explicitly marked outdated due to SC2 changes, so don’t benchmark against it blindly
- SMACv2 exists now; this is the original, presumably in maintenance mode
Verdict
Worth a look if you’re doing cooperative MARL research and need a visually interpretable, battle-tested environment with established baselines. Skip it if you want general RTS strategy or if the SMACv2 successor has already solved your problem.