coreylynch/async-rl

DeepMind's async RL paper, reassembled from spare Keras parts

A readable, low-RAM implementation of asynchronous 1-step Q-learning, A3C's sibling from the same paper, that runs on a 4 GB MacBook.

1k stars · Python · Agents · ML Frameworks

What it does

Implements asynchronous 1-step Q-learning from DeepMind’s 2016 paper using Keras for the network, TensorFlow for optimization, and OpenAI Gym for Atari environments. Multiple actor-learner threads replace the usual experience replay buffer, which keeps memory usage low enough for modest hardware.
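To make the idea concrete, here is a minimal sketch of asynchronous 1-step Q-learning. It is not the repository's code: it assumes a hypothetical six-state chain environment instead of Atari, a shared Q-table instead of a Keras network, and it leaves out the target network the paper uses. All names (`step`, `actor_learner`, `train`) are illustrative. What it does show is the core pattern: several threads act in their own copy of the environment and write each transition into shared values immediately, so no transition is ever stored in a replay buffer.

```python
import random
import threading

N_STATES = 6       # toy chain: states 0..5, reward only on reaching state 5
ACTIONS = (0, 1)   # 0 = move left, 1 = move right
GAMMA = 0.9        # discount factor
ALPHA = 0.1        # learning rate
EPSILON = 0.2      # fixed exploration rate (an assumption for this sketch)


def step(state, action):
    """Hypothetical deterministic environment standing in for an Atari game."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def actor_learner(q, lock, seed, n_steps):
    """One thread: act epsilon-greedily, then update the shared table at once."""
    rng = random.Random(seed)
    state = 0
    for _ in range(n_steps):
        if rng.random() < EPSILON:
            action = rng.choice(ACTIONS)
        else:
            # Greedy action, breaking ties at random so an all-zero table
            # does not lock the agent into one direction.
            best = max(q[state])
            action = rng.choice([a for a in ACTIONS if q[state][a] == best])
        next_state, reward, done = step(state, action)
        with lock:
            # 1-step Q-learning target: r + gamma * max_a' Q(s', a')
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
        # The transition is discarded here; nothing goes into a replay buffer.
        state = 0 if done else next_state


def train(n_threads=4, n_steps=5000):
    """Run several actor-learners against one shared Q-table."""
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    lock = threading.Lock()
    threads = [
        threading.Thread(target=actor_learner, args=(q, lock, seed, n_steps))
        for seed in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return q


if __name__ == "__main__":
    for s, row in enumerate(train()):
        print(s, [round(v, 3) for v in row])
```

The lock is a simplification for clarity. Memory use stays flat because each thread holds only its current transition, which is the property that lets this approach skip the replay buffer.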

The interesting bit

The author built this to learn TensorFlow, not to win benchmarks, and openly admits it. That honesty is refreshing: he notes that the original paper averaged “the best 5 models from 50 experiments,” a detail he initially missed and one that explains why single runs can look like failures. It’s a practical warning dressed as a README footnote.
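The gap between a best-5-of-50 average and a typical single run is easy to see with a little arithmetic. The sketch below uses made-up scores (not results from the paper or the repository), and `summarize_runs` is a hypothetical helper, not something the project provides.

```python
from statistics import mean


def summarize_runs(final_scores, k=5):
    """Return (mean over all runs, mean over the k best runs)."""
    best_k = sorted(final_scores, reverse=True)[:k]
    return mean(final_scores), mean(best_k)


# Made-up final scores for 50 hypothetical training runs: 0, 10, ..., 490.
scores = [10 * i for i in range(50)]
overall, top5 = summarize_runs(scores)
print(overall, top5)
```

With these invented numbers the mean over all 50 runs is 245, while the mean of the best 5 is 470. A single run drawn from the same spread would usually land well below the reported figure even if nothing were wrong with the code.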

Key highlights

  • Runs on a MacBook with 4 GB RAM by skipping experience replay entirely
  • Keras model definition is cleanly separated in model.py
  • Includes TensorBoard logging for episode rewards and max Q values
  • Evaluation mode produces Gym-compatible uploads
  • Partial A3C implementation exists in a3c.py as a next-step stub

Caveats

  • The author warns of high variance run-to-run; you may need multiple seeds
  • Built against TensorFlow r0.9 and old Gym APIs, so expect bit-rot
  • a3c.py is explicitly marked work-in-progress

Verdict

Worth a look if you’re teaching yourself async RL and want readable, commented code before diving into production frameworks. Skip it if you need battle-tested implementations or modern PyTorch equivalents.
