siemanko/tensorflow-deepq

A 1,166-star gravestone to early TensorFlow RL

This repo's own author killed it in favor of OpenAI Baselines—here's what remains.

1.2k stars · Jupyter Notebook · ML Frameworks · Agents
Velocity (7d): +0.3 ★ / day · Trend: steady

What it does An early (circa 2015-2016) Deep Q-Learning implementation in TensorFlow, built around a simple modular framework: controllers pick actions, simulators run environments, and a simulate() function glues them together. Includes a “Karpathy game” demo where a neural net learns to chase green dots, avoid red ones, and flee orange penalties.
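To make the controller / simulator / simulate() split concrete, here is a minimal sketch of that shape in plain Python. Only the idea of a simulate() function gluing a controller to a simulator comes from the description above. The class names (RandomController, LineWorld), the method names (action, observe, perform_action, collect_reward), and the simulate() signature are illustrative assumptions, not the repo's actual API.

```python
import random


class RandomController:
    """Hypothetical controller: picks a uniformly random action."""

    def __init__(self, num_actions, seed=0):
        self.num_actions = num_actions
        self.rng = random.Random(seed)

    def action(self, observation):
        # Ignores the observation; a learning controller would not.
        return self.rng.randrange(self.num_actions)


class LineWorld:
    """Hypothetical simulator: an agent on a line, rewarded while at `goal`."""

    def __init__(self, goal=3):
        self.goal = goal
        self.position = 0

    def observe(self):
        return self.position

    def perform_action(self, action):
        # Action 1 moves right, anything else moves left; clamp to [0, goal].
        step = 1 if action == 1 else -1
        self.position = max(0, min(self.goal, self.position + step))

    def collect_reward(self):
        return 1.0 if self.position == self.goal else 0.0


def simulate(simulator, controller, num_steps):
    """Assumed glue loop: observe, act, collect reward; returns total reward."""
    total = 0.0
    for _ in range(num_steps):
        observation = simulator.observe()
        action = controller.action(observation)
        simulator.perform_action(action)
        total += simulator.collect_reward()
    return total


if __name__ == "__main__":
    print(simulate(LineWorld(), RandomController(num_actions=2), num_steps=50))
```

Because the loop only touches the small method surface of each object, swapping in a different controller or simulator means writing one new class and leaving simulate() alone.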

The interesting bit The author, who later joined OpenAI, left this up as a historical artifact with a blunt “now obsolete” banner and a link to Baselines. The human controller via Redis is a charmingly over-engineered touch: you steer the agent yourself by feeding it real-time input through a local Redis server.

Key highlights

  • Modular tf_rl design: swap controllers (DeepQ, human, your own) or simulators via clean interfaces
  • store() + training_step() pattern: the controller logs each transition explicitly via store() and trains once per simulation step via training_step() (the docs warn that a training step “should not take too long”)
  • Built-in GIF generation pipeline via Inkscape frames
  • Human controller requires local Redis server for real-time input
  • 1,166 stars despite the author actively telling people to leave
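The store() + training_step() pattern from the highlights can be sketched as follows. This is not the repo's code: to stay dependency-free it swaps the neural network for a tabular Q-learning update, and the class name TabularQController, the constructor parameters, and the argument order of store() are all assumptions made for illustration. Only the two method names come from the description above.

```python
import random
from collections import defaultdict, deque


class TabularQController:
    """Hypothetical controller using a Q-table instead of a deep network."""

    def __init__(self, num_actions, capacity=1000, batch_size=8,
                 learning_rate=0.5, discount=0.9, epsilon=0.1, seed=0):
        self.num_actions = num_actions
        self.batch_size = batch_size
        self.learning_rate = learning_rate
        self.discount = discount
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        # Bounded replay buffer: the oldest transitions fall off the end.
        self.replay = deque(maxlen=capacity)
        # Q-values default to 0.0 for every action in an unseen state.
        self.q = defaultdict(lambda: [0.0] * num_actions)

    def action(self, observation):
        # Epsilon-greedy: explore occasionally, otherwise act greedily.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.num_actions)
        values = self.q[observation]
        return values.index(max(values))

    def store(self, observation, action, reward, new_observation):
        # Explicit transition logging; no learning happens here.
        self.replay.append((observation, action, reward, new_observation))

    def training_step(self):
        # Called once per simulation step, so keep the work small:
        # one fixed-size minibatch, and nothing until the buffer can fill it.
        if len(self.replay) < self.batch_size:
            return
        batch = self.rng.sample(list(self.replay), self.batch_size)
        for obs, act, reward, new_obs in batch:
            target = reward + self.discount * max(self.q[new_obs])
            self.q[obs][act] += self.learning_rate * (target - self.q[obs][act])
```

Separating store() from training_step() lets the simulation loop decide how often to train, and capping each call at one minibatch is one way to respect the "should not take too long" warning.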

Caveats

  • Explicitly abandoned; author redirects to OpenAI Baselines
  • Dependencies are pinned to ancient versions (future==0.15.2, euclid==0.1)
  • No topics, no recent commits, no community activity visible

Verdict Worth a quick scroll if you’re writing a history of Deep Q-Learning implementations or want to see how early TensorFlow RL code was structured. Skip it entirely if you actually need to train an agent today—Baselines, Stable-Baselines3, or CleanRL will save you hours of dependency archaeology.
