ShangtongZhang/reinforcement-learning-an-introduction
A Python implementation of algorithms from the Sutton & Barto reinforcement learning textbook, covering bandits, dynamic programming, and temporal-difference learning.

Velocity (7d): +4.1 ★/day · Trend: steady
This repository provides Python implementations of algorithms from the classic reinforcement learning textbook by Sutton and Barto. It reproduces figures and exercises from the book, covering multi-armed bandits, grid-world environments, dynamic programming, and temporal-difference learning methods. The code is structured by chapter and designed for educational use.
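To give a sense of the kind of algorithm covered in the bandit chapter, here is a minimal sketch of an epsilon-greedy agent on a multi-armed bandit with incremental sample-average value estimates. It is not taken from the repository: the function name `run_epsilon_greedy`, its parameters, and the unit-variance Gaussian reward model are assumptions made for illustration only.

```python
import random


def run_epsilon_greedy(true_means, steps=1000, epsilon=0.1, seed=0):
    """Illustrative epsilon-greedy bandit (hypothetical helper, not the repo's code).

    true_means: expected reward of each arm (assumed Gaussian, unit variance).
    Returns the final value estimates and the pull count of each arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # Q(a): running average reward per arm
    counts = [0] * n_arms       # N(a): number of times each arm was pulled

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate
            # (ties resolve to the lowest index).
            arm = max(range(n_arms), key=lambda a: estimates[a])

        reward = rng.gauss(true_means[arm], 1.0)

        # Incremental sample-average update: Q <- Q + (R - Q) / N
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts


if __name__ == "__main__":
    estimates, counts = run_epsilon_greedy([0.2, 0.5, 1.0], steps=2000)
    print("estimates:", [round(q, 2) for q in estimates])
    print("pull counts:", counts)
```

With a small `epsilon`, most pulls should concentrate on the arm with the highest true mean once its estimate has been sampled a few times; the exact numbers depend on the seed.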