opendilab/LightZero
A unified benchmark for Monte Carlo Tree Search (MCTS) algorithms in reinforcement learning, implementing AlphaZero, MuZero, and variants for games and control tasks.

LightZero provides PyTorch implementations of MCTS-based reinforcement learning algorithms including AlphaZero, MuZero, EfficientZero, Gumbel MuZero, and Stochastic MuZero. It benchmarks these agents across board games (Gomoku, TicTacToe), Atari environments, and continuous control tasks. The project is designed as both a research benchmark and a training framework for self-play decision-making agents.
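
For readers new to this algorithm family, the following is a minimal sketch of the PUCT child-selection rule that AlphaZero-style tree search uses to balance a child's estimated value against its prior probability and visit count. It is a generic illustration, not LightZero's API: the `Node` class, the `puct_score` and `select_child` functions, and the `c_puct=1.25` default are all hypothetical names and values chosen for this example.

```python
import math
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Node:
    """Hypothetical search-tree node; not a LightZero class."""
    prior: float                      # policy prior P(s, a) for reaching this node
    visit_count: int = 0              # N(s, a)
    value_sum: float = 0.0            # sum of backed-up values
    children: Dict[int, "Node"] = field(default_factory=dict)

    def value(self) -> float:
        # Mean backed-up value Q(s, a); 0.0 for an unvisited node.
        return self.value_sum / self.visit_count if self.visit_count else 0.0


def puct_score(parent: Node, child: Node, c_puct: float = 1.25) -> float:
    # Exploration bonus grows with the prior and the parent's visits,
    # and shrinks as the child itself is visited more often.
    exploration = (
        c_puct * child.prior * math.sqrt(parent.visit_count) / (1 + child.visit_count)
    )
    return child.value() + exploration


def select_child(parent: Node, c_puct: float = 1.25) -> int:
    # Return the action whose child has the highest PUCT score.
    return max(
        parent.children,
        key=lambda action: puct_score(parent, parent.children[action], c_puct),
    )


if __name__ == "__main__":
    root = Node(prior=1.0, visit_count=10)
    root.children = {
        0: Node(prior=0.6, visit_count=8, value_sum=4.0),
        1: Node(prior=0.4, visit_count=2, value_sum=1.0),
    }
    print("selected action:", select_child(root))
```

In this toy tree both children have the same mean value, so the less-visited child receives the larger exploration bonus and is selected. MuZero and its variants use a related rule with additional terms and value normalization, which this sketch omits.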