A Korean med-tech lab's RL zoo: 15 algorithms, one codebase
Medipixel open-sourced its internal reinforcement-learning research stack, from Rainbow IQN to GAIL, with public Weights & Biases logs for most experiments.

What it does
This is a PyTorch implementation of core deep-RL algorithms maintained by Medipixel, a Korean medical AI company. It covers the standard zoo—DQN variants (Rainbow, IQN, R2D1, Ape-X), policy-gradient methods (A2C, PPO, ACER), and continuous-control actors (DDPG, TD3, SAC)—plus imitation-learning extras like BC, DQfD, GAIL, and policy distillation. Everything is wired for OpenAI Gym environments with provided configs for Pong, LunarLander, and Reacher.
The interesting bit
The authors actually use this for research, and it shows in the details: they deliberately dropped DuelingNet from Rainbow IQN after finding it degrades performance, and they ship a ResNet backbone variant alongside the standard MLP. The repo also includes a class diagram—unusual for an RL codebase—suggesting they care about structural clarity over quick scripting.
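For readers unfamiliar with what a dueling head changes, here is a minimal sketch of the standard dueling aggregation, Q(s, a) = V(s) + A(s, a) - mean(A), next to a plain head that emits Q-values directly. It is written in pure Python rather than PyTorch to stay self-contained, and the function names `dueling_q_values` and `plain_q_values` are hypothetical, not taken from the repo.

```python
from statistics import fmean


def dueling_q_values(state_value, advantages):
    """Combine a scalar V(s) with per-action advantages A(s, a).

    Uses the common mean-subtracted form:
        Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)
    Illustrative only; not the repo's implementation.
    """
    mean_adv = fmean(advantages)
    return [state_value + a - mean_adv for a in advantages]


def plain_q_values(q_outputs):
    """A non-dueling head: the network outputs are used as Q-values as-is."""
    return list(q_outputs)


if __name__ == "__main__":
    # Three actions, V(s) = 1.0; the advantages happen to average to zero.
    print(dueling_q_values(1.0, [0.5, -0.5, 0.0]))  # [1.5, 0.5, 1.0]
    print(plain_q_values([1.5, 0.5, 1.0]))          # [1.5, 0.5, 1.0]
```

Because the mean advantage is subtracted, the dueling Q-values always average to V(s). Dropping the dueling head, as the authors report doing for Rainbow IQN, amounts to using the plain path.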
Key highlights
- 15 algorithms with dedicated subdirectories, not monolithic spaghetti
- W&B integration with public logs for most experiments; reproducibility claims tied to specific commit hashes
- Ape-X DQN tested with 4 workers, showing ~2× wall-clock speedup over serial Dueling DQN on Pong
- Rainbow IQN hits perfect score (21) on PongNoFrameskip-v4 within 100 episodes per their logs
- Includes less-common implementations: R2D1 (Recurrent Replay DQN), ACER, and policy distillation
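Tying a result to a commit hash, as the second highlight describes, can be sketched generically as below. How the repo itself records hashes is not stated here, so this is an assumption-laden illustration: `current_commit` and `run_metadata` are hypothetical helpers, and the only external command used is `git rev-parse HEAD`.

```python
import subprocess


def current_commit(repo_dir="."):
    """Return the checked-out commit hash, or "unknown" if it cannot be read."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()
    except (subprocess.CalledProcessError, OSError):
        # Not a git checkout, git is not installed, or the directory is missing.
        return "unknown"


def run_metadata(algo, env_id, seed, commit):
    """Bundle what is needed to reproduce a run (hypothetical schema)."""
    return {"algo": algo, "env": env_id, "seed": seed, "commit": commit}


if __name__ == "__main__":
    meta = run_metadata("RainbowIQN", "PongNoFrameskip-v4", 777, current_commit())
    print(meta)
```

A dictionary like this could be attached to an experiment tracker's run config so that each logged curve points back to the exact code that produced it.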
Caveats
- Performance section carries an explicit “won’t be frequently updated” warning
- Many LunarLander experiments are described as “quick verification,” not tuned baselines
- Limited compute meant Ape-X was tested with only 4 workers, so its scaling behavior beyond that is unverified
Verdict
Worth a look if you’re implementing RL from papers and want a second opinion on architecture choices, or if you need a teaching codebase with breadth. Skip it if you want a polished, actively maintained framework like Stable-Baselines3—this is research-grade glue with rough edges.