Stable-Baselines-Team/stable-baselines3-contrib

Stable-Baselines3's messy garage: where experimental RL lives

A holding pen for reinforcement learning algorithms too fresh or too niche for the main library.

720 stars · Python · ML Frameworks · Agents
Velocity (7d): +0.3 ★/day · Trend: steady

What it does

SB3-Contrib is the experimental annex to Stable-Baselines3, hosted under the Stable-Baselines-Team organization. It houses RL algorithms and tools that aren’t ready for the polished core library, or that don’t fit it. Think of it as a researcher’s workshop with the same API conventions but looser admission standards.

The interesting bit

The project explicitly embraces mess. The maintainers admit these utilities are “too niche” or “too difficult to integrate well” into the main codebase, yet they still enforce documentation and code style. It’s a rare admission that not everything needs to be production-grade to be worth sharing.

Key highlights

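  • Seven RL algorithms: ARS (Augmented Random Search), QR-DQN (Quantile Regression DQN), MaskablePPO (PPO with invalid-action masking), RecurrentPPO (PPO with a recurrent policy), TQC (Truncated Quantile Critics), TRPO (Trust Region Policy Optimization), and CrossQ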
  • One Gym wrapper: Time Feature Wrapper
  • Same API patterns as Stable-Baselines3, so swapping between main and contrib is mostly painless
  • The development version on GitHub requires the master version of the main SB3 library, not just its PyPI release
  • Active CI and black code formatting, despite the “experimental” label
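
The Time Feature Wrapper in the list above is easier to picture with a toy version. The idea, as commonly described, is to append the normalized remaining time to each observation so the agent knows how close a fixed-length episode is to ending. The sketch below is a minimal illustration under that assumption; the `TimeFeature` class and its `reset`/`step` methods are hypothetical stand-ins written for this page, not the sb3-contrib API.

```python
from typing import List


class TimeFeature:
    """Illustrative only: appends normalized remaining time to an observation.

    Hypothetical helper, not the wrapper shipped in sb3-contrib.
    """

    def __init__(self, max_steps: int) -> None:
        if max_steps <= 0:
            raise ValueError("max_steps must be positive")
        self.max_steps = max_steps
        self.current_step = 0

    def reset(self, observation: List[float]) -> List[float]:
        # A new episode starts with all of its time remaining.
        self.current_step = 0
        return self._augment(observation)

    def step(self, observation: List[float]) -> List[float]:
        # One more step of the episode budget has been used.
        self.current_step += 1
        return self._augment(observation)

    def _augment(self, observation: List[float]) -> List[float]:
        # Remaining time runs from 1.0 (start) down to 0.0 (budget exhausted).
        remaining = 1.0 - self.current_step / self.max_steps
        return list(observation) + [max(remaining, 0.0)]


time_feature = TimeFeature(max_steps=4)
first = time_feature.reset([0.5, -0.2])   # full budget left
second = time_feature.step([0.6, -0.1])   # one of four steps used
```

The real wrapper operates on a Gym environment rather than bare lists, so treat this as a picture of the feature, not a drop-in replacement.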

Caveats

  • “Almost everything remotely useful goes” — the bar is deliberately low, so quality varies
  • Experimental status means APIs may shift without the stability guarantees of the main library

Verdict

Worth a look if you’re reproducing a recent RL paper or need an oddball algorithm like MaskablePPO, which adds invalid-action masking to PPO. Skip it if you want battle-tested defaults; the main Stable-Baselines3 repo already covers PPO, SAC, DQN, and friends.
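
For readers who haven’t met invalid-action masking, the core trick is small: before sampling, push the scores of disallowed actions to negative infinity so they end up with zero probability. The sketch below shows that step in plain Python. It is a generic illustration of the technique, and `masked_softmax` is a hypothetical helper, not a function from sb3-contrib or a description of how MaskablePPO is implemented internally.

```python
import math
from typing import List, Sequence


def masked_softmax(logits: Sequence[float], valid: Sequence[bool]) -> List[float]:
    """Softmax over logits where invalid actions get probability zero.

    Hypothetical helper for illustration; not part of any library.
    """
    if len(logits) != len(valid):
        raise ValueError("logits and valid must have the same length")
    if not any(valid):
        raise ValueError("at least one action must be valid")

    # Invalid actions are pushed to -inf so exp() maps them to exactly 0.0.
    masked = [x if ok else -math.inf for x, ok in zip(logits, valid)]

    # Subtract the largest valid logit for numerical stability.
    peak = max(masked)
    exps = [math.exp(x - peak) for x in masked]
    total = sum(exps)
    return [e / total for e in exps]


# Action 1 is illegal in this state, so it can never be sampled.
probs = masked_softmax([1.0, 2.0, 3.0], [True, False, True])
```

The remaining probability mass is renormalized over the legal actions only, which is why masking tends to help in environments where most moves are illegal at any given step.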
