araffin/rl-baselines-zoo

A museum of 120+ trained RL agents, now closed for renovation

Pre-trained Stable Baselines agents with tuned hyperparameters, though the zoo itself has moved: active development continues in RL-Baselines3 Zoo.

1.2k stars · Python · ML Frameworks · Agents
Velocity (7d): +0.4 ★ / day · Trend: steady

What it does

This repository houses over 120 pre-trained reinforcement learning agents across Atari, classic control, Box2D, PyBullet, and MiniGrid environments. Each agent comes with tuned hyperparameters stored in YAML files, plus scripts to train new agents, watch existing ones (enjoy.py), or optimize hyperparameters via Optuna.
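As a rough illustration of how these scripts are driven, the sketch below builds the command lines for watching and training an agent. The flag names (--algo, --env, --folder, -n, -optimize, --n-trials) follow the usage shown in the repository's README as best recalled, so check them against the repo before relying on them. The helpers enjoy_command and train_command are hypothetical and not part of the zoo; they only return argument lists and execute nothing.

```python
import sys


def enjoy_command(algo, env_id, folder="trained_agents/", n_timesteps=5000):
    """Return the argv list for replaying a pre-trained agent with enjoy.py.

    Flag names are assumed from the zoo's README, not verified here.
    """
    return [
        sys.executable, "enjoy.py",
        "--algo", algo,
        "--env", env_id,
        "--folder", folder,
        "-n", str(n_timesteps),
    ]


def train_command(algo, env_id, n_timesteps=None, optimize=False, n_trials=None):
    """Return the argv list for training (or tuning) an agent with train.py."""
    cmd = [sys.executable, "train.py", "--algo", algo, "--env", env_id]
    if n_timesteps is not None:
        cmd += ["-n", str(n_timesteps)]
    if optimize:
        # Assumed flag that switches train.py into Optuna search mode.
        cmd.append("-optimize")
        if n_trials is not None:
            cmd += ["--n-trials", str(n_trials)]
    return cmd
```

From the repository root, a list such as enjoy_command("ppo2", "BreakoutNoFrameskip-v4") could be passed to subprocess.run(cmd, check=True).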

The interesting bit

The real value isn’t the agents themselves; it’s the hyperparameter matrix. The README documents which algorithm-environment pairs have a trained agent (marked with ✓) and which ones nobody has bothered to train yet. It’s a pragmatic map of where Stable Baselines is known to work and where it remains untested.
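A matrix like this can be rebuilt from a directory of saved agents. The sketch below is a minimal version under one stated assumption: agents are stored as <root>/<algo>/<env_id>.pkl. That layout, and the function names coverage and missing_pairs, are hypothetical and may not match the repository exactly.

```python
from pathlib import Path


def coverage(root, algos, env_ids):
    """Map each (algo, env_id) pair to True if a saved agent file exists.

    Assumes the hypothetical layout <root>/<algo>/<env_id>.pkl.
    """
    root = Path(root)
    return {
        (algo, env_id): (root / algo / f"{env_id}.pkl").is_file()
        for algo in algos
        for env_id in env_ids
    }


def missing_pairs(matrix):
    """Return the sorted (algo, env_id) pairs that have no trained agent."""
    return sorted(pair for pair, present in matrix.items() if not present)


def render(matrix, algos, env_ids):
    """Render the matrix as plain-text rows, one environment per line."""
    lines = []
    for env_id in env_ids:
        marks = ["✓" if matrix[(algo, env_id)] else "-" for algo in algos]
        lines.append(f"{env_id}: " + " ".join(marks))
    return "\n".join(lines)
```

The empty cells returned by missing_pairs are the useful part: they show where no tuned baseline exists, which is not the same as the algorithm having failed there.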

Key highlights

  • 120+ trained agents with benchmarked scores in benchmark.md
  • Hyperparameter search via Optuna (though not for ACER or DQN)
  • Support for environment wrappers and custom kwargs without code changes
  • Docker images and a Colab notebook for immediate experimentation
  • Video recording utility for agent demos
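The "custom kwargs without code changes" highlight comes down to turning command-line tokens into a keyword dictionary. A minimal sketch of that idea with argparse follows. The --env-kwargs flag name and the key:value token format are assumptions about the zoo's interface, and KeyValueAction is a hypothetical name. ast.literal_eval is used here as a conservative way to parse values, which may differ from what the zoo does internally.

```python
import argparse
import ast


class KeyValueAction(argparse.Action):
    """Collect `key:value` tokens into a dict stored on the namespace."""

    def __call__(self, parser, namespace, values, option_string=None):
        parsed = {}
        for token in values:
            key, _, raw = token.partition(":")
            try:
                # Interpret numbers, booleans, lists, etc. as Python literals.
                parsed[key] = ast.literal_eval(raw)
            except (ValueError, SyntaxError):
                # Fall back to the raw string for bare words such as `easy`.
                parsed[key] = raw
        setattr(namespace, self.dest, parsed)


def parse_env_kwargs(argv):
    """Parse a list of CLI tokens and return the environment kwargs dict."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--env-kwargs", nargs="+", action=KeyValueAction, default={})
    return parser.parse_args(argv).env_kwargs
```

The resulting dictionary could then be splatted into an environment constructor, for example gym.make(env_id, **env_kwargs), so new settings need no edits to the training script.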

Caveats

  • Explicitly unmaintained: the README banner redirects users to RL-Baselines3 Zoo, which uses Stable-Baselines3 instead
  • Some algorithm-environment combinations have no trained agent at all (TRPO on every Atari game, and most MiniGrid environments for anything other than PPO)
  • PyBullet environments are derived from Roboschool, and the README notes they are “much harder than the MuJoCo version”

Verdict

Worth browsing if you’re stuck debugging why your PPO won’t learn on BipedalWalkerHardcore, or if you need historical baselines for comparison. Otherwise, follow the README’s own advice and head to the newer RL-Baselines3 Zoo.
