openai/universe-starter-agent

OpenAI's deprecated RL starter kit still teaches A3C basics

A reference A3C implementation for real-time environments, now archived in favor of Retro.

1.1k stars Python Agents
Velocity (7d): +0.3 ★ / day · Trend: steady

What it does This repo houses a starter reinforcement-learning agent built around the A3C (Asynchronous Advantage Actor-Critic) algorithm. It trains parallel workers to play Atari Pong, VNC-hosted games, and Flash titles like Neon Race, with TensorBoard monitoring and tmux orchestration baked in.
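To make the "advantage" in A3C concrete, here is a minimal sketch of how one worker might turn a short rollout into discounted returns and advantages. It is the textbook n-step form, not code taken from the repository; the function names, the gamma default, and the sample numbers are illustrative assumptions.

```python
def discounted_returns(rewards, bootstrap_value, gamma=0.99):
    """Return the discounted return for every step of a rollout.

    bootstrap_value is the critic's estimate for the state after the
    last step (use 0.0 if the episode ended there).
    """
    returns = []
    running = bootstrap_value
    # Walk the rollout backwards so each step reuses the later return.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(rewards, values, bootstrap_value, gamma=0.99):
    """Advantage = discounted return minus the critic's value estimate."""
    returns = discounted_returns(rewards, bootstrap_value, gamma)
    return [ret - value for ret, value in zip(returns, values)]


if __name__ == "__main__":
    # Hypothetical three-step rollout that ends with a reward of 1.
    rewards = [0.0, 0.0, 1.0]
    values = [0.0, 0.5, 0.5]  # made-up critic outputs
    print(discounted_returns(rewards, 0.0, gamma=0.5))  # [0.25, 0.5, 1.0]
    print(advantages(rewards, values, 0.0, gamma=0.5))  # [0.25, 0.0, 0.5]
```

In an actor-critic update, the advantages weight the policy-gradient term, while the returns serve as regression targets for the value head.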

The interesting bit The code is explicitly tuned for VNC Pong and real-time environments where network lag becomes part of the problem. The README treats latency as a first-class concern: co-locate the agent and the environment, or watch your reaction_time balloon and learning degrade.
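A rough way to see why lag matters in a real-time environment is to count how many frames go by while an observation and the resulting action are in flight. The sketch below is back-of-the-envelope arithmetic only; the helper name, the 100 ms round trip, and the 60 fps rate are illustrative assumptions, not measurements from the repository.

```python
def frames_in_flight(latency_ms, fps):
    """Number of environment frames that elapse during latency_ms.

    The environment does not pause for the agent, so an action computed
    from one frame lands this many frames later.
    """
    return latency_ms * fps / 1000


if __name__ == "__main__":
    # Assumed 100 ms round trip against an assumed 60 fps environment.
    print(frames_in_flight(100, 60))  # 6.0
    # The same lag against a 5 fps environment covers half a frame.
    print(frames_in_flight(100, 5))   # 0.5
```

The faster the environment ticks, the more stale the agent's view is by the time its action arrives, which is why keeping agent and environment on the same machine helps.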

Key highlights

  • Basic A3C implementation adapted for real-time (non-frame-perfect) environments
  • Spawns tmux session with dedicated windows for each worker, parameter server, and TensorBoard
  • Solves local Pong in ~30 minutes with 16 workers on an m4.10xlarge, or ~10 minutes with 32 workers on an m4.16xlarge
  • Supports VNC remote environments with viewable agent behavior via TurboVNC
  • Flash games run at 5fps, making them feasible on modest core counts
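The tmux layout mentioned above (one window per worker, plus a parameter server and TensorBoard) can be sketched as a function that only builds the shell commands, without running them. This is an illustration of the pattern, not the repository's launcher: the session name, window names, and the `run_role.py` script with its flags are hypothetical placeholders. The tmux subcommands (`new-session`, `new-window`, `send-keys`) and `tensorboard --logdir` are real.

```python
import shlex


def window_layout(num_workers, log_dir):
    """Return (window_name, shell_command) pairs for one training run."""
    # run_role.py and its flags are hypothetical stand-ins for real entry points.
    windows = [("ps", "python run_role.py --role ps")]
    for i in range(num_workers):
        windows.append(("w-%d" % i, "python run_role.py --role worker --task %d" % i))
    windows.append(("tb", "tensorboard --logdir %s" % shlex.quote(log_dir)))
    return windows


def build_tmux_commands(session, windows):
    """Build tmux commands: create the session, add windows, then start each job."""
    first_name = windows[0][0]
    cmds = ["tmux new-session -d -s %s -n %s"
            % (shlex.quote(session), shlex.quote(first_name))]
    for name, _ in windows[1:]:
        cmds.append("tmux new-window -t %s -n %s"
                    % (shlex.quote(session), shlex.quote(name)))
    for name, command in windows:
        target = shlex.quote("%s:%s" % (session, name))
        # send-keys types the command into the window; Enter runs it.
        cmds.append("tmux send-keys -t %s %s Enter" % (target, shlex.quote(command)))
    return cmds


if __name__ == "__main__":
    for line in build_tmux_commands("a3c", window_layout(2, "/tmp/pong")):
        print(line)
```

Printing the commands instead of executing them makes the layout easy to inspect first; piping the output to a shell would then start the session.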

Caveats

  • Deprecated: OpenAI points users to the Retro library instead
  • Tuned specifically for VNC Pong; performance on other tasks is explicitly not guaranteed
  • Dependency stack is frozen in time: TensorFlow 0.12, Python 2.7/3.5, and a Golang requirement

Verdict Worth a quick read if you’re studying A3C or real-time RL challenges, but don’t build on it; this is a museum piece. Active projects should start with Retro or modern Stable-Baselines3 instead.
