marlbenchmark/on-policy
Multi-Agent PPO (MAPPO) implementation for cooperative multi-agent reinforcement learning.

This repository provides the official implementation of MAPPO, a multi-agent variant of Proximal Policy Optimization (PPO). It implements the algorithm from the paper ‘The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games’ and is built on top of a PyTorch A2C-PPO-ACKTR codebase. It supports training in several multi-agent environments: StarCraft II (SMAC/SMACv2), Hanabi, the Multi-Agent Particle-World Environments, and Google Research Football.
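For readers unfamiliar with PPO, the clipped surrogate objective at its core can be sketched in a few lines. This is a minimal illustration in plain Python, not code from the repository: the function name `ppo_clip_loss`, its parameters, and the default `clip_eps=0.2` are assumptions made for this example. It assumes the batch has already been flattened across agents and time steps, and it covers only the policy term, not the value loss or entropy bonus.

```python
import math


def ppo_clip_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate loss (a quantity to minimise).

    Hypothetical helper for illustration only. Each argument is a flat
    list with one entry per (agent, time step) sample:
      new_logp   -- log-probability of the taken action under the current policy
      old_logp   -- log-probability under the policy that collected the data
      advantages -- advantage estimate for the sample
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the current and the data-collecting policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    # Negate because the surrogate is maximised while a loss is minimised.
    return -total / len(advantages)
```

When the two policies agree, every ratio is 1 and the loss reduces to the negated mean advantage. Once the ratio moves outside the clipping range in the direction that would increase the objective, the clipped term takes over and further movement earns no extra credit.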