A C++ reinforcement-learning library that won't make you write Python in C++
AI-Toolbox wraps MDP and POMDP solvers in readable C++ with Python bindings, borrowing from the classic MATLAB MDPToolbox and Cassandra's pomdp-solve.

What it does
AI-Toolbox implements Markov Decision Processes (MDPs) and Partially Observable MDPs (POMDPs) in C++, with solvers ranging from Value Iteration to Monte Carlo Tree Search. It also ships Python bindings that mirror the C++ API almost line-for-line, and it accepts native Python generative models, which is handy for hooking up OpenAI Gym environments without rewriting them in C++.
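For readers who have not met Value Iteration before, here is a minimal sketch of the algorithm in plain Python. It deliberately does not use AI-Toolbox: the function name, the nested-list layout of `T` and `R`, and the two-state toy problem are all assumptions made for illustration, not the library's API.

```python
def value_iteration(T, R, discount, tol=1e-8, max_iter=10_000):
    """Tabular value iteration (illustrative, not the AI-Toolbox API).

    T[s][a][s1] is the probability of moving from s to s1 under action a.
    R[s][a][s1] is the reward received on that transition.
    """
    n_states = len(T)
    n_actions = len(T[0])
    V = [0.0] * n_states
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(max_iter):
        # Bellman backup: expected immediate reward plus discounted next value.
        Q = [
            [
                sum(T[s][a][s1] * (R[s][a][s1] + discount * V[s1])
                    for s1 in range(n_states))
                for a in range(n_actions)
            ]
            for s in range(n_states)
        ]
        new_V = [max(q_row) for q_row in Q]
        delta = max(abs(x - y) for x, y in zip(new_V, V))
        V = new_V
        if delta < tol:
            break
    # Greedy policy with respect to the last computed Q-values.
    policy = [max(range(n_actions), key=lambda a: Q[s][a])
              for s in range(n_states)]
    return V, policy


# Toy problem (hypothetical): staying in state 1 pays 1 per step.
# Action 0 = stay, action 1 = switch to the other state. All moves are deterministic.
T = [
    [[1.0, 0.0], [0.0, 1.0]],  # from state 0
    [[0.0, 1.0], [1.0, 0.0]],  # from state 1
]
R = [
    [[0.0, 0.0], [0.0, 0.0]],  # nothing is earned from state 0
    [[0.0, 1.0], [0.0, 0.0]],  # staying in state 1 earns 1
]
V, policy = value_iteration(T, R, discount=0.9)
```

With a discount of 0.9, the values should settle near 9 for state 0 and 10 for state 1, with the greedy policy switching out of state 0 and staying in state 1.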
The interesting bit
The library is deliberately designed around a 10-method interface: any custom model that implements those methods can be fed to the solvers. That templated approach in C++ gets flattened into pre-instantiated Python classes, so you gain extensibility on the C++ side and convenience on the Python side. The README also notes the Cassandra POMDP file parser, which lets you reuse existing problem definitions from the POMDP.org ecosystem rather than coding models by hand.
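The "implement the interface, then reuse the solvers" idea can be sketched in Python with structural typing. This is only an analogy to the C++ templates described above: the `ModelLike` protocol, its five method names, and the `Corridor` class are hypothetical and are not the methods AI-Toolbox actually requires.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class ModelLike(Protocol):
    """Hypothetical interface: anything with these methods counts as a model."""

    def num_states(self) -> int: ...
    def num_actions(self) -> int: ...
    def discount(self) -> float: ...
    def transition_probability(self, s: int, a: int, s1: int) -> float: ...
    def expected_reward(self, s: int, a: int, s1: int) -> float: ...


class Corridor:
    """Three cells in a row; action 0 moves left, action 1 moves right.

    Arriving in the last cell from another cell pays 1.
    Note that it does not inherit from ModelLike: matching methods are enough.
    """

    def num_states(self):
        return 3

    def num_actions(self):
        return 2

    def discount(self):
        return 0.9

    def transition_probability(self, s, a, s1):
        target = max(0, s - 1) if a == 0 else min(2, s + 1)
        return 1.0 if s1 == target else 0.0

    def expected_reward(self, s, a, s1):
        return 1.0 if (s1 == 2 and s != 2) else 0.0


def one_step_backup(model: ModelLike, V):
    """A generic routine that touches the model only through the interface."""
    n_s, n_a, g = model.num_states(), model.num_actions(), model.discount()
    return [
        max(
            sum(
                model.transition_probability(s, a, s1)
                * (model.expected_reward(s, a, s1) + g * V[s1])
                for s1 in range(n_s)
            )
            for a in range(n_a)
        )
        for s in range(n_s)
    ]


model = Corridor()
backed_up = one_step_backup(model, [0.0, 0.0, 0.0])
```

The design point is that `one_step_backup` never names `Corridor`; any object exposing the same methods would work, which is roughly what the templated C++ solvers get at compile time.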
Key highlights
- Covers bandits, single-agent MDPs, and POMDPs with a long list of algorithms (Dyna-Q, MCTS, Thompson Sampling, WoLF, etc.)
- Python bindings support both Python 2 and 3, with examples showing near-identical C++ and Python code
- Native Python generative model support means you can sample state transitions and rewards directly without explicit transition matrices
- Extensive utility layer: polytopes, linear programming, combinatorics, factored data structures, belief updating
- Published in JMLR (2020), so there’s a proper academic citation if you use it for research
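To make the generative-model highlight concrete, the sketch below shows a model that can only be sampled, with no transition matrix anywhere, and a tabular Q-learning loop that learns from those samples. It is plain Python and does not call AI-Toolbox; `SlipperyCorridor`, `sample_sr`, `is_terminal`, and `q_learning` are illustrative names, and the slip probability and learning parameters are arbitrary assumptions.

```python
import random


class SlipperyCorridor:
    """Five cells in a row; the rightmost cell is a terminal goal.

    Action 0 tries to move left, action 1 tries to move right.
    With probability `slip` the agent stays where it is instead.
    """

    N_STATES = 5
    N_ACTIONS = 2
    GOAL = 4

    def __init__(self, slip=0.2, seed=0):
        self.slip = slip
        self.rng = random.Random(seed)

    def is_terminal(self, s):
        return s == self.GOAL

    def sample_sr(self, s, a):
        """Sample one (next_state, reward) pair; no matrices are stored."""
        if self.rng.random() < self.slip:
            s1 = s
        else:
            s1 = max(0, s - 1) if a == 0 else min(self.GOAL, s + 1)
        r = 1.0 if (s1 == self.GOAL and s != self.GOAL) else 0.0
        return s1, r


def q_learning(model, episodes=2000, alpha=0.1, discount=0.9,
               max_steps=200, seed=1):
    """Off-policy tabular Q-learning driven by a uniformly random behavior policy."""
    rng = random.Random(seed)
    Q = [[0.0] * model.N_ACTIONS for _ in range(model.N_STATES)]
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            if model.is_terminal(s):
                break
            a = rng.randrange(model.N_ACTIONS)
            s1, r = model.sample_sr(s, a)
            # Do not bootstrap from a terminal state.
            bootstrap = 0.0 if model.is_terminal(s1) else discount * max(Q[s1])
            Q[s][a] += alpha * (r + bootstrap - Q[s][a])
            s = s1
    return Q


model = SlipperyCorridor()
Q = q_learning(model)
# Greedy action for each non-terminal state.
greedy = [max(range(model.N_ACTIONS), key=lambda a: Q[s][a])
          for s in range(model.GOAL)]
```

Because the learner only ever calls `sample_sr`, the same loop would work with any simulator that can return a next state and a reward, which is the appeal of wrapping an existing environment rather than writing out its dynamics.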
Caveats
- The README warns that serious customization or novel algorithms still require dropping into C++
- Python lacks templates, so the bindings are pre-instantiated; you may hit limitations if you need exotic numeric types
Verdict
Worth a look if you’re doing classical RL research or teaching and want a readable, well-documented C++ foundation with Python training wheels. Skip it if you need deep neural network integration: this is old-school tabular and factored methods, not PyTorch.