google-deepmind/disco_rl
A minimal JAX implementation for meta-learning reinforcement learning update rules, accompanying a Nature publication.

DiscoRL provides code for discovering and reproducing state-of-the-art reinforcement learning algorithms through meta-learning. The project centers on Disco103, a meta-learned update rule for RL training. It offers a JAX-based harness with two modes: meta-evaluation, which trains agents with an already discovered update rule, and meta-training, which learns new update rules from scratch. The implementation builds on the Haiku and Flax neural-network libraries for JAX.
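To make the idea of a meta-learned update rule concrete, here is a minimal JAX sketch. It is not Disco103 and does not use the disco_rl API: the "update rule" is only a learned step size, and every name in it (`task_loss`, `inner_update`, `meta_objective`, `meta_train`, `TARGET`) is hypothetical. It shows only the general pattern of differentiating through an agent's unrolled updates to improve the rule that produced them.

```python
import jax
import jax.numpy as jnp

# Hypothetical toy task: the agent parameter `theta` should reach TARGET.
TARGET = 3.0


def task_loss(theta):
    """Squared distance of the agent parameter from the target."""
    return (theta - TARGET) ** 2


def inner_update(theta, eta):
    """One agent update under the learned rule.

    Assumption for illustration: the rule is a single meta-parameter `eta`
    that sets a positive step size via softplus. A real discovered rule is
    far richer than this.
    """
    step_size = jax.nn.softplus(eta)
    return theta - step_size * jax.grad(task_loss)(theta)


def meta_objective(eta, theta0, num_steps=5):
    """Task loss after unrolling `num_steps` agent updates with rule `eta`."""
    theta = theta0
    for _ in range(num_steps):
        theta = inner_update(theta, eta)
    return task_loss(theta)


def meta_train(eta, theta0, meta_lr=0.1, meta_steps=50):
    """Improve the rule by gradient descent through the unrolled updates."""
    meta_grad = jax.jit(jax.grad(meta_objective))
    for _ in range(meta_steps):
        eta = eta - meta_lr * meta_grad(eta, theta0)
    return eta


eta_init = jnp.array(-3.0)   # starts with a very small step size
theta_init = jnp.array(0.0)

eta_learned = meta_train(eta_init, theta_init)
loss_before = meta_objective(eta_init, theta_init)
loss_after = meta_objective(eta_learned, theta_init)
print(float(loss_before), float(loss_after))
```

In this sketch, calling `meta_objective` with a fixed `eta` loosely corresponds to meta-evaluation, and `meta_train` to meta-training; the actual harness operates on RL agents and environments rather than a scalar regression target.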