An educational reinforcement learning project where an agent learns to buy and sell a single stock through rewards and penalties, not explicit instructions.
Agents
Phillip is a Super Smash Bros. Melee AI trained with deep reinforcement learning to brawl in the Dolphin emulator, though its creator has since moved on to imitation learning.
A straightforward TensorFlow implementation of A3C that trains Pong agents for 26 hours and actually shows its work.
A Java-based bot platform that predates the LLM era and still runs on JUnit and Objective-C.
A C# toolkit for wiring up sensors, AI models, and actuators when latency actually matters.
A TensorFlow port of NIPS 2016's Best Paper, embedding value iteration directly inside a neural network for grid-world navigation.
A teaching-oriented AI bot that abstracts away the differences between Brood War and SC2 so you can focus on strategy, not API archaeology.
A Keras/TensorFlow project that learns your keyboard and mouse habits by watching you play, then attempts to mimic them.
A PyTorch reimplementation of an AAAI 2018 paper that frames video summarization as a reinforcement learning problem, rewarding diversity and representativeness instead of ground-truth labels.
A PyTorch DOOM bot that won the ViZDoom AI Competition by learning to frag with deep reinforcement learning.
A 2017 RL framework that treats multi-hop knowledge graph (KG) reasoning as pathfinding, with embedding-based states and a reward function that weighs accuracy, diversity, and efficiency rather than just reaching the target.
A lightweight, rule-based intent parser for voice assistants that trades ML complexity for explicit control.
Tock is an open-source conversational AI platform for teams who want to build bots without surrendering their data to a SaaS black box.
Reaver squeezed a 1.5x sampling speedup out of single-machine setups by ditching MPI for lock-free shared memory, then the author walked away.
A 2018 AAAI paper that fixes the classic GAN problem for text generation—scalar rewards arriving too late—by letting the discriminator leak its own hidden features mid-generation.
A 2016-era DQN agent that learns Super Mario World from raw pixels, with a Spatial Transformer to focus on what matters.
A Ruby framework that treats bot-building like web development, with MVC architecture and Redis-backed state machines.
An enterprise team's attempt to make Rasa NLU actually deployable without a PhD.
A tidy reference implementation of five major deep RL algorithms, pinned to a very specific Keras version from 2018.
A browser-based reinforcement learning demo that teaches a bird to fly by dying repeatedly.







