Teaching robots what we want by watching them stumble
A clean reference implementation of three foundational inverse reinforcement learning algorithms, born from a university course on AI.

What it does
This repo implements three classic inverse reinforcement learning (IRL) algorithms in Python: linear programming IRL (Ng & Russell, 2000), maximum entropy IRL (Ziebart et al., 2008), and deep maximum entropy IRL (Wulfmeier et al., 2015). It also bundles two standard test environments, Gridworld and Objectworld, plus value iteration utilities. The goal of IRL is to work backwards: given observed behavior, recover a reward function under which that behavior is optimal or nearly so. The problem is underdetermined, since many reward functions can explain the same behavior, which is why each algorithm adds its own criterion for choosing among them.
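To make the forward/backward framing concrete, here is a minimal sketch of the forward problem that IRL inverts: value iteration turns a known reward into an optimal policy. It is written from scratch in plain Python and is not the repo's API. The function names, the five-state chain environment, and the convention that reward is collected on entering a state are all assumptions chosen for illustration.

```python
def q_value(transition, reward, values, discount, s, a):
    """Expected return of taking action a in state s, then acting optimally."""
    return sum(p * (reward[s2] + discount * values[s2])
               for p, s2 in transition[s][a])


def value_iteration(transition, reward, discount=0.9, tol=1e-8):
    """transition[s][a] is a list of (probability, next_state) pairs."""
    n_states = len(transition)
    values = [0.0] * n_states
    while True:
        delta = 0.0
        for s in range(n_states):
            best = max(q_value(transition, reward, values, discount, s, a)
                       for a in range(len(transition[s])))
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


def greedy_policy(transition, reward, values, discount=0.9):
    """Pick the action with the highest Q-value in every state."""
    return [max(range(len(transition[s])),
                key=lambda a: q_value(transition, reward, values, discount, s, a))
            for s in range(len(transition))]


# Hypothetical 5-state chain: action 0 moves left, action 1 moves right,
# and moves past either end leave the agent where it is.
n = 5
transition = [[[(1.0, max(s - 1, 0))], [(1.0, min(s + 1, n - 1))]]
              for s in range(n)]
reward = [0.0, 0.0, 0.0, 0.0, 1.0]  # only the rightmost state pays off

values = value_iteration(transition, reward)
policy = greedy_policy(transition, reward, values)
print(policy)  # [1, 1, 1, 1, 1]: always move right
```

IRL starts from the other end: it is shown the "always move right" behavior and has to propose a reward vector, such as the one above, that makes this behavior optimal.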
The interesting bit
The code is coursework from a 2016 ANU class supervised by Marcus Hutter, and it includes a linked technical report with original derivations. That academic pedigree means the implementations hew closely to the papers rather than chasing benchmarks or PyTorch ports.
Key highlights
- Covers both small and large state-space variants of linear programming IRL
- Maximum entropy implementation includes state visitation frequency and feature expectation helpers
- Deep variant uses Theano (yes, Theano) for symbolic computation
- Self-contained: environments, solvers, and evaluation metrics in one package
- Zenodo DOI provided for proper citation
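The two maximum entropy helpers named above correspond to the two quantities the algorithm compares. The following is a rough sketch of what such helpers compute, following the description in Ziebart et al. (2008); it is not the repo's code. The function names, argument layout, and the toy three-state chain are hypothetical.

```python
def feature_expectations(features, trajectories):
    """Average per-trajectory sum of the feature vectors of visited states."""
    n_features = len(features[0])
    totals = [0.0] * n_features
    for trajectory in trajectories:
        for state in trajectory:
            for k in range(n_features):
                totals[k] += features[state][k]
    return [t / len(trajectories) for t in totals]


def expected_svf(transition, policy, start_dist, horizon):
    """Expected number of visits to each state over `horizon` time steps.

    transition[s][a] is a list of (probability, next_state) pairs and
    policy[s][a] is the probability of taking action a in state s.
    """
    n_states = len(start_dist)
    current = list(start_dist)   # state distribution at the current step
    svf = list(start_dist)       # running sum over time steps
    for _ in range(horizon - 1):
        nxt = [0.0] * n_states
        for s in range(n_states):
            for a, action_prob in enumerate(policy[s]):
                for p, s2 in transition[s][a]:
                    nxt[s2] += current[s] * action_prob * p
        current = nxt
        svf = [total + c for total, c in zip(svf, current)]
    return svf


# Hypothetical 3-state chain with one-hot state features.
n = 3
transition = [[[(1.0, max(s - 1, 0))], [(1.0, min(s + 1, n - 1))]]
              for s in range(n)]
features = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
demos = [[0, 1, 2], [0, 1, 2]]             # expert always walks right
policy = [[0.0, 1.0] for _ in range(n)]    # learner also always walks right

mu_expert = feature_expectations(features, demos)
svf = expected_svf(transition, policy, [1.0, 0.0, 0.0], horizon=3)

# Gradient of the MaxEnt log-likelihood with respect to linear reward weights:
# expert feature expectations minus the learner's expected feature counts.
gradient = [mu_expert[k] - sum(svf[s] * features[s][k] for s in range(n))
            for k in range(len(mu_expert))]
print(gradient)  # [0.0, 0.0, 0.0]
```

The gradient is zero here because the learner's policy already reproduces the demonstrations exactly. In a full implementation the policy would be recomputed from the current reward estimate on every iteration, and the weights updated until this difference vanishes.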
Caveats
- Theano dependency dates the project: Theano's core development ended in 2017, so modern users will need a legacy environment or a port of the deep variant to a current framework
- No performance claims, tests, or continuous integration visible in the README
- README function documentation is thorough, but the project appears to have been unmaintained since 2016
Verdict
Grab this if you are teaching, learning, or reproducing classic IRL papers from first principles. Skip it if you need production-scale inverse RL or modern deep-learning infrastructure.