A PyTorch toolbox that swaps model-based RL parts like Lego bricks
Facebook Research's library aims to cut model-based reinforcement learning algorithms down to a few lines of interchangeable code.

What it does
MBRL-Lib is a PyTorch-based toolbox for building model-based reinforcement learning algorithms. It provides swappable modeling and planning components plus utility functions so you can write MBRL methods with minimal boilerplate. It ships with reference implementations of PETS, MBPO, and PlaNet, all configured through Facebook’s Hydra framework.
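To make the "swappable components" idea concrete, here is a minimal sketch in plain Python. It does not use MBRL-Lib's actual API: the names DynamicsModel, PointMassModel, and RandomShootingPlanner are hypothetical, and the toy dynamics are invented for illustration. It only shows the pattern of a planner that works with any model exposing the same method.

```python
import random
from typing import List, Protocol, Tuple


class DynamicsModel(Protocol):
    """Anything that predicts (next_state, reward) from (state, action)."""

    def predict(self, state: float, action: float) -> Tuple[float, float]: ...


class PointMassModel:
    """Hypothetical hand-written model: the action nudges a 1-D position."""

    def predict(self, state: float, action: float) -> Tuple[float, float]:
        next_state = state + 0.1 * action
        return next_state, -abs(next_state)  # reward is higher near the origin


class RandomShootingPlanner:
    """Samples random action sequences and keeps the best-scoring one."""

    def __init__(self, horizon: int = 5, num_candidates: int = 200, seed: int = 0):
        self.horizon = horizon
        self.num_candidates = num_candidates
        self.rng = random.Random(seed)

    def plan(self, model: DynamicsModel, state: float) -> List[float]:
        best_return, best_plan = float("-inf"), []
        for _ in range(self.num_candidates):
            candidate = [self.rng.uniform(-1.0, 1.0) for _ in range(self.horizon)]
            total, sim_state = 0.0, state
            # Roll the candidate out through the model and sum predicted rewards.
            for action in candidate:
                sim_state, reward = model.predict(sim_state, action)
                total += reward
            if total > best_return:
                best_return, best_plan = total, candidate
        return best_plan


planner = RandomShootingPlanner()
plan = planner.plan(PointMassModel(), state=1.0)
```

Because the planner only calls predict, a learned model could replace PointMassModel without touching the planning code, which is the kind of interchangeability the library is built around.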
The interesting bit
The library treats the tedious part — wiring dynamics models, planners, and environments together — as the main event. The diagnostics toolkit is unusually thorough: a visualizer that plots model predictions against reality with uncertainty bands, a dataset evaluator for scatter-plotting prediction accuracy, and even a multi-CPU controller that farms trajectory optimization across cores. There’s also a PyQt5 training browser for comparing runs, which feels like the kind of quality-of-life feature most research code skips.
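The uncertainty bands mentioned above can be illustrated with a small sketch. This is not the library's visualizer code; it assumes an ensemble whose members each predict a trajectory, and it takes the band to be the ensemble mean plus or minus two standard deviations. The function names and the sine-wave data are hypothetical.

```python
import math
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def uncertainty_band(
    ensemble_predictions: Sequence[Sequence[float]], num_std: float = 2.0
) -> List[Tuple[float, float, float]]:
    """Return (lower, mean, upper) per time step.

    ensemble_predictions[m][t] is member m's prediction at step t.
    """
    band = []
    for step_values in zip(*ensemble_predictions):
        center = mean(step_values)
        spread = pstdev(step_values)
        band.append((center - num_std * spread, center, center + num_std * spread))
    return band


def coverage(band: Sequence[Tuple[float, float, float]], ground_truth: Sequence[float]) -> float:
    """Fraction of true values that fall inside the band."""
    inside = sum(lower <= y <= upper for (lower, _, upper), y in zip(band, ground_truth))
    return inside / len(ground_truth)


# Hypothetical data: three ensemble members predicting sin(t) with small offsets.
steps = [0.1 * t for t in range(20)]
truth = [math.sin(t) for t in steps]
members = [[math.sin(t) + offset for t in steps] for offset in (-0.05, 0.0, 0.08)]

band = uncertainty_band(members)
print(f"coverage: {coverage(band, truth):.2f}")
```

Plotting the lower and upper values as a shaded region around the mean, with the ground truth drawn on top, gives the kind of predictions-versus-reality picture described here.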
Key highlights
- Reference implementations of PETS, MBPO, and PlaNet with Hydra configs and tuned hyperparameter overrides
- Compatible with any environment that follows the standard Gymnasium interface; tested on MuJoCo, DMControl, and PyBullet-Gym
- Diagnostics suite: prediction visualizer, dataset evaluator, fine-tuner, multi-CPU controller, and training browser
- Tutorial notebook walking through PETS on continuous cartpole
- Community extensions including HuggingFace Hub integration and trajectory-based dynamics models
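For readers unsure what Gymnasium-style syntax means in practice, here is a minimal sketch of the interface: reset returns an (observation, info) pair, and step returns (observation, reward, terminated, truncated, info). The PointMassEnv class is a hypothetical toy and deliberately avoids importing gymnasium; a real environment would typically subclass gymnasium.Env and declare observation and action spaces.

```python
import random
from typing import Optional


class PointMassEnv:
    """Hypothetical 1-D environment using Gymnasium-style reset/step signatures."""

    def __init__(self, max_steps: int = 50):
        self.max_steps = max_steps
        self._rng = random.Random()
        self._position = 0.0
        self._steps = 0

    def reset(self, *, seed: Optional[int] = None, options: Optional[dict] = None):
        if seed is not None:
            self._rng.seed(seed)
        self._position = self._rng.uniform(-1.0, 1.0)
        self._steps = 0
        return self._position, {}  # (observation, info)

    def step(self, action: float):
        # Clip the action to [-1, 1] before applying it.
        self._position += 0.1 * max(-1.0, min(1.0, action))
        self._steps += 1
        reward = -abs(self._position)
        terminated = abs(self._position) < 0.01    # reached the goal
        truncated = self._steps >= self.max_steps  # hit the time limit
        return self._position, reward, terminated, truncated, {}


env = PointMassEnv()
obs, info = env.reset(seed=0)
done = False
while not done:
    action = -1.0 if obs > 0 else 1.0  # simple bang-bang controller
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated
```

The split between terminated and truncated is the detail most likely to trip up older Gym code, which returned a single done flag.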
Caveats
- Most diagnostics tools require MuJoCo and only support OneDimTransitionRewardModel models; broader support is planned but not yet implemented
- DMControl and PyBullet-Gym need gym==0.26.3, a pinned older version
- The README notes Python 3.8+ in one place and shows a 3.7+ badge in another
Verdict
Worth a look if you’re doing MBRL research and tired of reimplementing the same infrastructure. Probably overkill if you just want to run a single algorithm out of the box — the value is in the modularity, not the novelty of the included methods.