OpenAI's deprecated RL starter kit still teaches A3C basics
A reference A3C implementation for real-time environments, now archived in favor of Retro.

What it does
This repo houses a starter reinforcement-learning agent built around the A3C (Asynchronous Advantage Actor-Critic) algorithm. It trains parallel workers to play Atari Pong, VNC-hosted games, and Flash titles like Neon Race, with TensorBoard monitoring and tmux orchestration baked in.
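For readers new to A3C, the "advantage" in the name can be illustrated with a short sketch. This is a generic n-step formulation written for this summary, not code taken from the repo, and the repo's own estimator may differ. The function names, the `gamma` default, and the `bootstrap_value` parameter are all assumptions made for illustration.

```python
def discounted_returns(rewards, bootstrap_value, gamma=0.99):
    """Return the n-step discounted return for every step of a rollout.

    bootstrap_value is the critic's value estimate for the state after the
    last step (use 0.0 if the episode ended there).
    """
    returns = []
    running = bootstrap_value
    # Walk the rollout backward so each return reuses the one after it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(rewards, values, bootstrap_value, gamma=0.99):
    """Advantage = observed discounted return minus the critic's prediction."""
    returns = discounted_returns(rewards, bootstrap_value, gamma)
    return [ret - value for ret, value in zip(returns, values)]
```

In an actor-critic setup, each worker would typically scale its policy-gradient term by these advantages and regress the critic toward the returns; how the repo combines them is not covered by this sketch.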
The interesting bit
The code is explicitly tuned for VNC Pong and real-time environments where network lag becomes part of the problem. The README treats latency as a first-class concern—co-locate agent and environment or watch your reaction_time balloon and your policy collapse.
Key highlights
- Basic A3C implementation adapted for real-time (non-frame-perfect) environments
- Spawns tmux session with dedicated windows for each worker, parameter server, and TensorBoard
- Solves local Pong in ~30 minutes with 16 workers on an m4.10xlarge, or ~10 minutes with 32 workers on an m4.16xlarge
- Supports VNC remote environments with viewable agent behavior via TurboVNC
- Flash games run at 5fps, making them feasible on modest core counts
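The tmux layout from the highlights (one window per worker, plus a parameter server and TensorBoard) can be sketched as a function that builds the shell commands. This is an illustrative reconstruction, not the repo's launcher: the function name, window names, worker command line, log directory, and port are all hypothetical, while the `tmux` subcommands and the `tensorboard` flags are standard ones.

```python
import shlex


def tmux_commands(session, num_workers, worker_cmd="python worker.py"):
    """Build tmux commands for one parameter server, N workers, and TensorBoard.

    worker_cmd and its --job-name/--task flags are placeholders for whatever
    the training script actually accepts.
    """
    windows = [("ps", f"{worker_cmd} --job-name ps")]
    windows += [
        (f"w-{i}", f"{worker_cmd} --job-name worker --task {i}")
        for i in range(num_workers)
    ]
    # Hypothetical log directory and port, chosen only for the example.
    windows.append(("tb", "tensorboard --logdir /tmp/a3c-logs --port 12345"))

    # The first window is created together with the detached session.
    cmds = [f"tmux new-session -d -s {session} -n {windows[0][0]}"]
    cmds += [f"tmux new-window -t {session} -n {name}" for name, _ in windows[1:]]
    # Type each process's command into its own window and press Enter.
    cmds += [
        f"tmux send-keys -t {session}:{name} {shlex.quote(cmd)} Enter"
        for name, cmd in windows
    ]
    return cmds
```

Printing the result of `tmux_commands("a3c", 16)` gives a script that can be reviewed before being piped to a shell, which keeps the orchestration easy to inspect.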
Caveats
- Deprecated: OpenAI points users to the Retro library instead
- Tuned specifically for VNC Pong; performance on other tasks is explicitly not guaranteed
- Dependency stack is frozen in time: TensorFlow 0.12, Python 2.7/3.5, and a Golang requirement
Verdict
Worth a quick read if you’re studying A3C or real-time RL challenges, but don’t build on it: this is a museum piece. Active projects should start with Retro or modern Stable-Baselines3 instead.