Teaching a neural net to rage-quit at pipes
A bare-bones PyTorch implementation of a Deep Q-Network (DQN) that learns to play Flappy Bird the hard way: by dying thousands of times.

What it does
This repo trains a Deep Q-Network to play Flappy Bird using PyTorch. You get two scripts: train.py to watch the agent flail and eventually improve, and test.py to see if it survives longer than you do. A pre-trained model is included if you’d rather skip the GPU hours.
The interesting bit
There’s no reward shaping or curriculum tricks, just raw Q-learning against a brutally simple environment where the only feedback is survival. The README calls it “a very basic example,” which undersells how much reinforcement learning pedagogy lives in getting this particular bird through those particular pipes.
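To make "raw Q-learning" concrete, here is a minimal sketch of the one-step target that a DQN regresses toward. It is not the repo's code: the reward values, the discount factor `gamma`, and the helper names `td_targets` and `mse_loss` are illustrative assumptions, and plain Python lists stand in for PyTorch tensors so the arithmetic stays visible.

```python
def td_targets(rewards, next_q_values, dones, gamma=0.99):
    """One-step Q-learning targets: r + gamma * max_a' Q(s', a').

    On terminal transitions (the bird died) there is no next state to
    bootstrap from, so the target is just the reward.
    """
    targets = []
    for reward, next_q, done in zip(rewards, next_q_values, dones):
        bootstrap = 0.0 if done else gamma * max(next_q)
        targets.append(reward + bootstrap)
    return targets


def mse_loss(predicted_q, targets):
    """Mean squared error between Q(s, a) for the taken actions and the targets."""
    return sum((q - t) ** 2 for q, t in zip(predicted_q, targets)) / len(targets)


# Hypothetical batch of two transitions with two actions each (idle, flap).
rewards = [0.1, -1.0]                     # assumed values: small bonus alive, penalty on death
next_q_values = [[0.5, 0.2], [0.3, 0.9]]  # Q(s', .) as the network might predict it
dones = [False, True]                     # the second transition ended the episode

targets = td_targets(rewards, next_q_values, dones, gamma=0.9)
loss = mse_loss([0.4, -0.8], targets)     # [0.4, -0.8] are assumed Q(s, a) predictions
```

In a PyTorch version the same two steps would typically be a tensor expression followed by a loss call and an optimizer step; the logic is unchanged.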
Key highlights
- Pure PyTorch implementation, no RL framework abstractions hiding the mechanics
- Includes a pre-trained model at trained_models/flappy_bird
- Visual feedback via pygame; you can watch the agent learn (or fail)
- Requires only a standard stack: Python 3.6, PyTorch, OpenCV, numpy, pygame
- ~550 stars suggest it has served as a common RL entry point
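For a sense of the mechanics an RL framework would otherwise hide, the sketch below shows two pieces that most DQN implementations wire up by hand: a replay memory and epsilon-greedy action selection. The `ReplayMemory` class and `epsilon_greedy` function are hypothetical names used for illustration, written with the standard library only. The repo's own structure may differ.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-size buffer of past transitions; the oldest entries fall off first."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling breaks the correlation between consecutive frames.
        k = min(batch_size, len(self.buffer))
        return random.sample(list(self.buffer), k)

    def __len__(self):
        return len(self.buffer)


def epsilon_greedy(q_values, epsilon):
    """Pick a random action with probability epsilon, else the highest-valued one."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Usage with placeholder integer "states"; real states would be stacked frames.
memory = ReplayMemory(capacity=3)
for step in range(5):
    memory.push(step, 0, 0.0, step + 1, False)

batch = memory.sample(2)
greedy_action = epsilon_greedy([0.1, 0.7], epsilon=0.0)
```

Training loops usually start with a high `epsilon` and decay it over time, so the agent explores early and exploits later; the decay schedule is another detail left to the implementer.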
Caveats
- README is minimal: no discussion of network architecture, hyperparameters, or training duration
- Python 3.6 requirement is dated; compatibility with newer versions is unclear
- No mention of GPU requirements, convergence time, or expected performance metrics
Verdict
Good for someone who wants to see DQN wiring without wrappers like Stable-Baselines. Skip it if you need modern best practices, distributed training, or documentation that explains why things work.