danijar/dreamer

Reinforcement learning that daydreams its way to better policies

An agent that imagines futures in compressed feature space, then backpropagates through its own dreams to learn long-horizon control.

603 stars · Python · Agents · Domain Apps
Velocity (7d): +0.3 ★/day · Trend: steady

What it does

Dreamer is a reinforcement learning agent that learns a world model to predict future states in a compact latent space. Instead of planning in raw pixels or other high-dimensional observations, it imagines trajectories in that latent space and learns a policy and value function from the imagined sequences. The implementation here is a clean TensorFlow 2 rewrite of the original research code.
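
To make the imagine-then-learn loop concrete, here is a minimal sketch in plain Python. It is not the repository's code: the encoder, policy, dynamics, reward and value functions are all hypothetical toy stand-ins for networks that would normally be learned, and the sizes (64 observation values, 4 latent dimensions, horizon 15) are arbitrary. The value targets use a λ-return, the kind of exponentially weighted multi-step target the Dreamer paper describes.

```python
import math
import random

def encode(obs, weights):
    """Toy encoder: squash a long observation into a short latent vector."""
    return [math.tanh(sum(w * o for w, o in zip(row, obs))) for row in weights]

def policy(latent):
    """Hypothetical actor: one scalar action in [-1, 1] read from the latent."""
    return math.tanh(sum(latent))

def transition(latent, action):
    """Hypothetical latent dynamics: predicts the next latent, no pixels involved."""
    return [math.tanh(0.9 * z + 0.1 * action) for z in latent]

def reward_head(latent):
    """Hypothetical reward predictor: higher when the latent is near zero."""
    return -sum(z * z for z in latent)

def value_head(latent):
    """Hypothetical critic: a fixed guess standing in for a trained value network."""
    return 10.0 * reward_head(latent)

def imagine(start, horizon):
    """Roll the model forward in latent space; no environment steps are taken."""
    latents, rewards = [start], []
    for _ in range(horizon):
        nxt = transition(latents[-1], policy(latents[-1]))
        latents.append(nxt)
        rewards.append(reward_head(nxt))  # rewards[t] is earned on reaching latents[t + 1]
    return latents, rewards

def lambda_returns(rewards, values, discount=0.99, lam=0.95):
    """Blend one-step and longer returns; values has one more entry than rewards."""
    assert len(values) == len(rewards) + 1
    targets = [0.0] * len(rewards)
    nxt = values[-1]  # bootstrap from the critic at the end of the dream
    for t in reversed(range(len(rewards))):
        nxt = rewards[t] + discount * ((1.0 - lam) * values[t + 1] + lam * nxt)
        targets[t] = nxt
    return targets

rng = random.Random(0)
obs = [rng.uniform(-1.0, 1.0) for _ in range(64)]                        # stand-in "image"
weights = [[rng.gauss(0.0, 0.1) for _ in range(64)] for _ in range(4)]   # 64 -> 4 latent dims
latents, rewards = imagine(encode(obs, weights), horizon=15)
targets = lambda_returns(rewards, [value_head(z) for z in latents])
print(len(latents), len(rewards), len(targets))
```

In a real agent the critic would be regressed toward `targets` and the actor updated to increase them; here the functions are fixed so the data flow stays easy to follow.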

The interesting bit

The clever part is backpropagating value gradients through multi-step imagined predictions, which amounts to assigning credit across futures that never actually happened. Because imagined rollouts cost no environment steps, the agent can learn long-horizon behaviors with fewer real interactions than model-free methods, which learn only from experience they actually collect.
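
The idea of differentiating through a dream can be shown on a toy problem small enough to check by hand. The sketch below assumes a one-dimensional linear model `s_next = a * s + b * action`, a linear policy `action = k * s`, and a reward of `-s_next ** 2`; none of these come from the repository, and the backward pass is written out manually where a real implementation would rely on automatic differentiation. It computes the gradient of the imagined return with respect to the policy parameter `k` by walking the imagined trajectory backward, then compares it to a finite-difference estimate.

```python
def imagined_return(k, s0=1.0, a=1.1, b=0.5, horizon=10, discount=0.95):
    """Forward pass: roll the toy model forward and sum discounted rewards."""
    states = [s0]
    total = 0.0
    for t in range(horizon):
        action = k * states[-1]                  # linear policy (assumption)
        nxt = a * states[-1] + b * action        # linear dynamics (assumption)
        total += (discount ** t) * -(nxt ** 2)   # reward for staying near zero
        states.append(nxt)
    return total, states

def grad_through_dream(k, s0=1.0, a=1.1, b=0.5, horizon=10, discount=0.95):
    """Backward pass: reverse-mode chain rule over the imagined trajectory."""
    _, states = imagined_return(k, s0, a, b, horizon, discount)
    grad_from_future = 0.0  # gradient reaching s_{t+1} from all later steps
    grad_k = 0.0
    for t in reversed(range(horizon)):
        nxt = states[t + 1]
        # Gradient at s_{t+1}: its own reward term plus everything downstream.
        grad_nxt = (discount ** t) * (-2.0 * nxt) + grad_from_future
        # s_{t+1} = (a + b * k) * s_t, so k affects it directly through b * s_t.
        grad_k += grad_nxt * b * states[t]
        # Pass the gradient one step further back, to s_t.
        grad_from_future = grad_nxt * (a + b * k)
    return grad_k

k = -1.0
analytic = grad_through_dream(k)
eps = 1e-6
numeric = (imagined_return(k + eps)[0] - imagined_return(k - eps)[0]) / (2 * eps)
print(abs(analytic - numeric) < 1e-6)

# One gradient-ascent step on the policy parameter, using only imagined data.
improved = imagined_return(k + 0.1 * analytic)[0]
print(improved > imagined_return(k)[0])
```

Every step of the rollout contributes to `grad_k`, so a reward ten steps ahead still shapes the policy parameter, which is the credit-assignment property described above.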

Key highlights

  • TensorFlow 2 implementation, positioned as “fast and simple” by the author
  • Targets DeepMind Control Suite tasks (e.g., dmc_walker_walk)
  • Generates training visualizations and GIFs via TensorBoard
  • Includes plotting utilities for analyzing learning curves
  • Original paper by Hafner et al. (2019); this is a reimplementation, not the official Google Research codebase

Caveats

  • Author notes DreamerV2 as the successor, with broader environment support (Atari + DMControl)
  • Pinned to older TensorFlow 2.2.0 and specific dependency versions; may need attention to run on modern stacks
  • README is minimal — no benchmark numbers, no training time estimates, no hardware requirements listed

Verdict

Worth a look if you’re studying world models or need a readable TensorFlow 2 reference implementation of Dreamer. Skip it if you want production-ready code or Atari support; the author explicitly points to DreamerV2 for that.
