Unity-Technologies/obstacle-tower-env

Unity's procedurally generated torture tower for AI agents

A gym-compatible 3D environment designed to break reinforcement learning agents that memorize instead of generalize.

546 stars · Python · Agents · Domain Apps
Velocity (7d): +0.2 ★/day · Trend: steady

What it does

Obstacle Tower is a Unity-built, OpenAI Gym-compatible benchmark where an agent climbs procedurally generated floors—each a maze of platforming, puzzles, and enemies—to reach stairs and advance. Difficulty scales as the agent progresses, and visual themes, room layouts, and puzzle configurations shuffle on every run. Unity provides pre-built binaries for Linux, macOS, and Windows, with auto-download on first gym instantiation.
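
To make "Gym-compatible" concrete, here is a minimal sketch of the reset/step loop such an environment is driven by. It assumes the classic four-value step return. `StubTowerEnv`, `run_episode`, and the `current_floor` info key are hypothetical stand-ins written for this example, not the real Obstacle Tower classes, so it runs without the Unity binary.

```python
class StubTowerEnv:
    """Hypothetical stand-in with a Gym-style reset/step interface.

    This is NOT the real Obstacle Tower environment; it only mimics the loop shape.
    """

    def __init__(self, floors=3, steps_per_floor=4):
        self.floors = floors
        self.steps_per_floor = steps_per_floor
        self.floor = 0
        self.progress = 0

    def reset(self):
        self.floor = 0
        self.progress = 0
        return (self.floor, self.progress)

    def step(self, action):
        reward = 0.0
        if action == 1:  # 1 = move toward the stairs, anything else = stand still
            self.progress += 1
        if self.progress == self.steps_per_floor:
            # Reaching the stairs advances one floor and pays a reward of 1.
            self.floor += 1
            self.progress = 0
            reward = 1.0
        done = self.floor == self.floors
        info = {"current_floor": self.floor}  # key name is an assumption
        return (self.floor, self.progress), reward, done, info


def run_episode(env, policy, max_steps=100):
    """Run one episode and return (total reward, highest floor reached)."""
    obs = env.reset()
    total_reward = 0.0
    floor_reached = 0
    for _ in range(max_steps):
        obs, reward, done, info = env.step(policy(obs))
        total_reward += reward
        floor_reached = info["current_floor"]
        if done:
            break
    return total_reward, floor_reached


total, floor = run_episode(StubTowerEnv(), policy=lambda obs: 1)
print(total, floor)  # 3.0 3
```

Swapping the stub for the real environment should leave `run_episode` largely unchanged, provided the installed version exposes the same reset/step signature.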

The interesting bit

The procedural generation isn’t window dressing; it’s the core thesis. Because floors recombine elements in unseen ways, agents can’t coast on memorized paths. The environment explicitly tests whether your RL agent actually learned vision, locomotion, and planning—or just overfit to a static level. There’s also a “Player Mode” for human control, presumably so you can personally experience why your algorithm is struggling.
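
The memorize-versus-generalize point can be illustrated with a toy model. The sketch below is not Obstacle Tower's generator: `generate_floor`, `rooms_cleared`, and both policies are hypothetical, and the "observation" is deliberately trivial. It only shows why a replayed action script scores well on its training seed and worse on held-out seeds.

```python
import random


def generate_floor(seed, length=8):
    """Toy procedural generator: the seed fixes which door (0 or 1) is open in each room."""
    rng = random.Random(seed)
    return [rng.randint(0, 1) for _ in range(length)]


def rooms_cleared(floor, policy):
    """Walk the rooms in order and stop at the first wrong door.

    policy(index, observation) returns the door to try; here the observation
    is simply the open door, standing in for what an agent would have to see.
    """
    cleared = 0
    for index, open_door in enumerate(floor):
        if policy(index, open_door) != open_door:
            break
        cleared += 1
    return cleared


train_floor = generate_floor(seed=0)


def memorizing_policy(index, observation):
    # Replays the training floor's solution and ignores the observation.
    return train_floor[index]


def observing_policy(index, observation):
    # Acts on what is in front of it, so the layout does not matter.
    return observation


held_out = [generate_floor(seed=s) for s in range(1, 21)]
memorized_scores = [rooms_cleared(f, memorizing_policy) for f in held_out]
observing_scores = [rooms_cleared(f, observing_policy) for f in held_out]

print("train floor, memorizing:", rooms_cleared(train_floor, memorizing_policy))
print("held-out mean, memorizing:", sum(memorized_scores) / len(held_out))
print("held-out mean, observing:", sum(observing_scores) / len(held_out))
```

Scoring on seeds the agent never trained on is the same idea behind evaluating on a fixed, pre-defined seed list.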

Key highlights

  • Up to 100 floors in generated towers (v2.0+)
  • Extensive reset parameters for customizing difficulty, visual themes, and content
  • Built-in evaluation wrapper with benchmarking guidelines and pre-defined seeds
  • Auto-downloads platform binaries via gym wrapper registry (v4.0+)
  • Includes GCP + Dopamine Rainbow training guide and Jupyter notebook examples
  • Source code openly available in separate obstacle-tower-source repo

Caveats

  • Requires Python 3.6+ and Unity ML-Agents 1.x; version lock-in may complicate integration with newer ML-Agents releases
  • A long version history (through v4.1) with repeated hotfixes for determinism, reset parameter application, and memory leaks suggests the environment has been finicky in practice
  • Docker memory leaks and environment freezes on higher floors were patched but may linger as edge cases

Verdict

Worth a look if you’re researching generalization in RL or need a visually rich, non-trivial 3D benchmark. Skip it if you want a stable, low-maintenance environment for quick algorithm prototyping—the Unity binary dependency and version-specific ML-Agents requirement add friction.
