Reinforcement learning with training wheels
A readable reimplementation of guided policy search for robotic control, aimed at researchers who want to understand the machinery rather than just run it.

What it does
This is a Python reimplementation of guided policy search (GPS) and of trajectory optimization based on linear-quadratic-Gaussian (LQG) control, two techniques for teaching robots to perform tasks through reinforcement learning. The code is written to be dissected and extended rather than treated as a black box.
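For readers new to the LQG side of this, the sketch below shows the kind of recursion that linear-quadratic methods are built on: a finite-horizon Riccati backward pass for a scalar linear system with quadratic cost. It is not taken from the repository. The function names, the scalar setting, and the noise-free dynamics are all assumptions made to keep the example short.

```python
# Illustrative only: a scalar, noise-free LQR backward pass.
# All names here are hypothetical and do not come from the GPS codebase.

def lqr_backward_pass(a, b, q, r, q_final, horizon):
    """Compute time-varying feedback gains with the Riccati recursion.

    Dynamics: x[t+1] = a * x[t] + b * u[t]
    Cost:     sum over t of (q * x[t]**2 + r * u[t]**2) + q_final * x[T]**2
    Returns gains K[0..T-1] for the policy u[t] = -K[t] * x[t].
    """
    p = q_final  # curvature of the cost-to-go at the final step
    gains = [0.0] * horizon
    for t in reversed(range(horizon)):
        k = (b * p * a) / (r + b * p * b)   # optimal gain at step t
        p = q + a * p * a - a * p * b * k   # cost-to-go curvature at step t
        gains[t] = k
    return gains


def rollout(a, b, gains, x0):
    """Simulate the closed-loop system under the computed gains."""
    xs = [x0]
    for k in gains:
        u = -k * xs[-1]
        xs.append(a * xs[-1] + b * u)
    return xs


gains = lqr_backward_pass(a=1.0, b=1.0, q=1.0, r=1.0, q_final=1.0, horizon=20)
states = rollout(a=1.0, b=1.0, gains=gains, x0=5.0)
```

With these settings every gain lies strictly between 0 and 1, so the state shrinks toward zero at each step. A practical trajectory optimizer works with vector states, matrix-valued gains, and noisy dynamics, so treat this as a map of the idea rather than a substitute for reading the project's code.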
The interesting bit
The authors explicitly label this “a work in progress”, which is unusual honesty for an academic code drop. The project prioritizes pedagogical clarity over polished packaging, which is either refreshing or frustrating depending on your deadline.
Key highlights
- Reimplements GPS and LQG trajectory optimization from scratch in Python
- Explicitly designed for understanding, reuse, and extension
- Full documentation lives externally at rll.berkeley.edu/gps
- FAQ page outlines planned future additions (suggesting active but incomplete development)
- 599 stars suggest niche but sustained interest in the robotics/RL community
Caveats
- README is extremely sparse; most detail lives off-repo at the Berkeley site
- “Work in progress” warning means APIs and features may shift
- No screenshots or diagrams were found, suggesting minimal visual polish
Verdict
Worth a look if you’re doing robotics RL research and need to hack on (not just run) guided policy search. Skip it if you need a batteries-included, production-ready framework: this is a lab codebase wearing its academic origins openly.