Farama-Foundation/ViZDoom

Doom as a Petri dish for reinforcement learning

A research platform that turns the 1993 shooter into a fast, lightweight gym for training AI agents from raw pixels.

2k stars · C++ · Agents · ML Frameworks
Velocity (7d): +0.5 ★ / day · Trend: steady

What it does

ViZDoom wraps the ZDoom engine to create reinforcement learning environments where agents must play Doom using only visual input (the screen buffer). It exposes the game through Python and C++ APIs, including Gymnasium wrappers, and ships with examples in PyTorch and TensorFlow.
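To make the Python API concrete, here is a minimal sketch of a random agent playing one episode from pixels. It assumes the `vizdoom` package is installed and that the bundled `basic.cfg` scenario exposes three buttons (move left, move right, attack). `make_basic_game`, `run_random_episode`, and `BASIC_ACTIONS` are illustrative names, not part of ViZDoom.

```python
import random

# Assumed button layout for basic.cfg: [MOVE_LEFT, MOVE_RIGHT, ATTACK].
BASIC_ACTIONS = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]


def make_basic_game():
    """Create a headless DoomGame for the bundled 'basic' scenario.

    vizdoom is imported lazily so the loop below can be read and tested
    without the package installed.
    """
    import os
    import vizdoom as vzd

    game = vzd.DoomGame()
    game.load_config(os.path.join(vzd.scenarios_path, "basic.cfg"))
    game.set_window_visible(False)  # off-screen rendering
    game.init()
    return game


def run_random_episode(game, actions, rng=random):
    """Play one episode with uniformly random actions; return the summed reward."""
    game.new_episode()
    total = 0.0
    while not game.is_episode_finished():
        state = game.get_state()
        _frame = state.screen_buffer  # raw pixels: what a learning agent would consume
        total += game.make_action(rng.choice(actions))
    return total


# Typical use (requires vizdoom):
#   game = make_basic_game()
#   print(run_random_episode(game, BASIC_ACTIONS))
#   game.close()
```

A real agent would replace `rng.choice(actions)` with a policy that maps the frame to an action; the loop itself stays the same.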

The interesting bit

The engine gives you more than RGB frames. You can tap the depth buffer for 3D vision, the audio buffer, automatic object labels, even map geometry and actor lists: sensor modalities that would be expensive to synthesize in a modern game engine. All of this comes in a package the README describes as “few MBs”, and it claims up to 7000 frames per second on a single CPU thread in sync mode.
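As a rough illustration of how those extra modalities are switched on, the sketch below enables them on a `DoomGame` before `init()` and then summarizes what a returned state carries. The two helper names are hypothetical. The setters and state fields are the ones the ViZDoom Python API exposes, to the best of my knowledge, but check them against the version you install.

```python
def enable_extra_sensors(game, audio=False):
    """Turn on the non-RGB outputs. Call before game.init()."""
    game.set_depth_buffer_enabled(True)    # per-pixel distance from the camera
    game.set_labels_buffer_enabled(True)   # per-pixel object labels plus a label list
    game.set_automap_buffer_enabled(True)  # top-down map rendering
    game.set_objects_info_enabled(True)    # list of actors in the level
    game.set_sectors_info_enabled(True)    # map geometry (sectors and lines)
    if audio:
        game.set_audio_buffer_enabled(True)  # needs OpenAL on the host


def describe_state(state):
    """Report buffer shapes (None when a buffer is disabled) and item counts."""
    def shape(buf):
        return None if buf is None else tuple(buf.shape)

    return {
        "screen": shape(state.screen_buffer),
        "depth": shape(state.depth_buffer),
        "labels": shape(state.labels_buffer),
        "automap": shape(state.automap_buffer),
        "audio": shape(state.audio_buffer),
        "n_labels": len(state.labels or []),
        "n_objects": len(state.objects or []),
        "n_sectors": len(state.sectors or []),
    }
```

Keeping audio opt-in mirrors the caveat that it depends on a system OpenAL install.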

Key highlights

  • Multi-platform wheels for Linux (x86-64, ARM64), macOS (Apple Silicon; Intel stuck at v1.2.4), and Windows (x86-64 only)
  • Custom scenarios via visual editors and ACS scripting; Freedoom assets included, original Doom WADs drop-in replaceable
  • Off-screen rendering, episode recording, and time-scaled async mode for faster-than-real-time training
  • Active ecosystem: maze generators, generalization benchmarks (LevDoom), continual learning (COOM), and safe-RL testbeds (HASARD)
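The off-screen, recording, and time-scaling features from the list above can be sketched in a few lines. Doom's game logic runs at 35 tics per second, so scaling async mode comes down to choosing a tic rate. The helper names here are hypothetical, and the snippet assumes `set_mode`, `set_ticrate`, and the recording-path argument to `new_episode` behave as in current ViZDoom releases.

```python
DOOM_TICS_PER_SECOND = 35  # the engine's native game-logic rate


def ticrate_for_speedup(speedup):
    """Tics per second needed to run async mode at `speedup` times real time."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return round(DOOM_TICS_PER_SECOND * speedup)


def configure_fast_async(game, speedup=4.0):
    """Headless async play at `speedup` x real time. Call before game.init()."""
    import vizdoom as vzd  # lazy import; requires the vizdoom package

    game.set_window_visible(False)        # off-screen rendering
    game.set_mode(vzd.Mode.ASYNC_PLAYER)  # engine advances on its own clock
    game.set_ticrate(ticrate_for_speedup(speedup))


def start_recorded_episode(game, path="episode.lmp"):
    """Begin an episode whose actions are saved to `path` for later replay."""
    game.new_episode(path)
```

In sync mode the engine simply waits for each action, so the tic rate matters only when the game runs on its own clock.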

Caveats

  • Windows build is explicitly “not as well-tested”; serious experiments warrant Docker or WSL
  • Audio requires system OpenAL; macOS Intel and older Python versions fall back to source builds with Boost, CMake, and SDL2 dependencies

Verdict

Grab this if you need a visually rich, cheap-to-run RL benchmark with decades of community tooling behind it. Skip it if you want photorealistic graphics or a plug-and-play Windows experience.
