keon/deep-q-learning

DQN in 100 lines: the tutorial code that won't bit-rot

A minimal Keras implementation of DQN and Double DQN, recently dragged into 2026 with gymnasium and Keras 3 support.

1.3k stars · Python · Agents · ML Frameworks · +0.4 ★/day over the last 7 days (trend: steady)

What it does

This repo contains stripped-down implementations of Deep Q-Network (DQN) and Double DQN (DDQN) agents using Keras, targeting the classic CartPole-v1 environment. The core dqn.py fits in under 100 lines. There’s also a batched-update variant (dqn_batch.py) and the more stable ddqn.py with actual Double DQN separation (online network picks actions, target network evaluates them).
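To make the online/target split concrete, here is a minimal sketch of how a Double DQN training target can be computed. It is not taken from ddqn.py: the function name `double_dqn_targets`, its arguments, and the plain-list representation of Q-values are assumptions chosen so the example runs without Keras.

```python
def double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.95):
    """Compute Double DQN targets for a batch of transitions.

    q_online_next[i] and q_target_next[i] hold the per-action Q-values that
    the online and target networks predict for the next state of transition i.
    All names here are illustrative, not the repository's API.
    """
    targets = []
    for r, done, q_on, q_tg in zip(rewards, dones, q_online_next, q_target_next):
        if done:
            # Terminal transition: no future value to bootstrap from.
            targets.append(r)
            continue
        # Online network selects the action...
        best_action = max(range(len(q_on)), key=lambda a: q_on[a])
        # ...and the target network supplies the value of that action.
        targets.append(r + gamma * q_tg[best_action])
    return targets


rewards = [1.0, 1.0]
dones = [False, True]
q_online_next = [[0.2, 0.9], [0.5, 0.1]]
q_target_next = [[2.0, 1.0], [3.0, 4.0]]

targets = double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.5)
print(targets)  # [1.5, 1.0]
```

In the first transition a plain DQN target would take the target network's own maximum (2.0) and produce 2.0. The Double DQN target instead uses the action the online network prefers (index 1), which the target network values at 1.0, giving 1.5. That decoupling is what damps the overestimation of Q-values.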

The interesting bit

Most tutorial repos from 2017 are archaeological sites by now. This one got a 2026 overhaul: migrated from the dead gym package to gymnasium, updated for Keras 3’s API changes (Input layers, keras.losses.Huber, learning_rate instead of lr), and made ddqn.py environment-agnostic by stripping out CartPole-specific reward shaping. The save()/load() helpers even persist epsilon now, so a resumed run picks up at the exploration rate it left off with instead of silently resetting it.
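The epsilon persistence is easy to picture with a small sketch. This is not the repository's code: the `AgentState` class, the JSON file format, and the decay constants below are hypothetical, and the real helpers presumably store model weights as well.

```python
import json
import os
import tempfile


class AgentState:
    """Hypothetical stand-in for the exploration state of a DQN agent."""

    def __init__(self, epsilon=1.0, epsilon_min=0.01, epsilon_decay=0.995):
        self.epsilon = epsilon
        self.epsilon_min = epsilon_min
        self.epsilon_decay = epsilon_decay

    def decay(self):
        # Multiplicative decay, floored at epsilon_min.
        self.epsilon = max(self.epsilon_min, self.epsilon * self.epsilon_decay)

    def save(self, path):
        # Assumed format: a JSON file holding only epsilon.
        with open(path, "w") as f:
            json.dump({"epsilon": self.epsilon}, f)

    def load(self, path):
        with open(path) as f:
            self.epsilon = json.load(f)["epsilon"]


trained = AgentState()
for _ in range(100):
    trained.decay()

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "agent_state.json")
    trained.save(path)

    resumed = AgentState()  # a fresh agent starts at epsilon = 1.0
    fresh_epsilon = resumed.epsilon
    resumed.load(path)      # restores the decayed value

print(fresh_epsilon, resumed.epsilon == trained.epsilon)
```

Without the `load` step, the resumed agent would explore at the constructor default rather than at the rate the earlier run had reached.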

Key highlights

  • Three variants: basic DQN, batched-update DQN, and Double DQN with Huber loss
  • Experience replay memory capped via deque (no unbounded list growth)
  • Modern API: gymnasium’s reset/step return format, Keras 3 compatible
  • Companion blog post walks through dqn.py line by line
  • ddqn.py implements proper decoupled action selection and evaluation
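The deque-capped replay memory from the list above can be sketched in a few lines. The class and method names (`ReplayMemory`, `remember`, `sample`) are illustrative assumptions, not necessarily the names used in the repository.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-capacity experience replay buffer (illustrative sketch)."""

    def __init__(self, capacity):
        # deque with maxlen discards the oldest entry once it is full.
        self.buffer = deque(maxlen=capacity)

    def remember(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform random minibatch without replacement.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)


memory = ReplayMemory(capacity=3)
for i in range(5):
    # Dummy transitions: the state is just the step index.
    memory.remember(state=i, action=0, reward=1.0, next_state=i + 1, done=False)

kept_states = [transition[0] for transition in memory.buffer]
batch = memory.sample(2)
print(len(memory), kept_states)  # 3 [2, 3, 4]
```

Five transitions go in but only the three most recent survive, which is the bounded-growth behaviour the bullet describes.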

Caveats

  • The basic dqn.py is explicitly noted as potentially unstable; use ddqn.py for reliable training
  • CartPole-only demonstrations; you’ll need to wire up your own environment wrappers for anything else

Verdict

Good for someone who wants to read a complete DQN implementation in one screenful and actually run it without dependency archaeology. Not for production RL pipelines — the value is pedagogical density, not scalability.
