LucasAlegre/sumo-rl

Teaching traffic lights to think with reinforcement learning

A Gymnasium/PettingZoo wrapper that turns SUMO traffic simulations into RL environments for signal control.

1.1k stars · Python · ML Frameworks · Domain Apps
Velocity (7d): +0.4 ★/day · Trend: steady

What it does

SUMO-RL wraps the SUMO traffic simulator so you can train RL agents to control traffic lights. It exposes single-intersection problems as standard Gymnasium environments and multi-light grids as PettingZoo multi-agent environments. The heavy lifting—vehicle physics, lane logic, phase transitions—is still SUMO’s; this library handles the translation layer.

The interesting bit

The default reward is deliberately simple: the change in cumulative vehicle delay from one step to the next, signed so that your agent gets a positive reward when total waiting time falls and a negative one when it rises. You can swap in a custom observation by inheriting from ObservationFunction, or a custom reward by passing a Python callable to reward_fn, with no need to fork the library.
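To make the sign convention concrete, here is a minimal sketch of a delay-difference reward in plain Python. It is not the library's implementation: the class name `DelayDeltaReward` is hypothetical, and it takes the cumulative delay as a plain float, whereas a real reward_fn callable would have to read that figure from whatever object the library hands it.

```python
class DelayDeltaReward:
    """Hypothetical sketch: reward = previous cumulative delay - current one."""

    def __init__(self) -> None:
        # Assumption: no delay has accumulated before the first step.
        self.last_delay = 0.0

    def __call__(self, current_delay: float) -> float:
        # Positive when delay went down, negative when it went up.
        reward = self.last_delay - current_delay
        self.last_delay = current_delay
        return reward


reward_fn = DelayDeltaReward()
delays = [10.0, 14.0, 9.0]  # made-up cumulative delays (seconds) at three steps
rewards = [reward_fn(d) for d in delays]
print(rewards)  # [-10.0, -4.0, 5.0]
```

The last step shows the intended behavior: delay dropped from 14 to 9 seconds, so the agent is rewarded with +5.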

Key highlights

  • Single-agent mode via gym.make('sumo-rl-v0', ...); multi-agent via parallel_env() with PettingZoo’s parallel API
  • Ships with RESCO benchmark networks (grid4x4, etc.) and example scripts for stable-baselines3 DQN, RLlib PPO, and Q-learning
  • Default observation encodes phase state, lane density, and queue lengths as normalized vectors
  • Optional ~8x speedup via Libsumo (LIBSUMO_AS_TRACI=1), though you lose the GUI and parallel simulations
  • Established in research: there's a BibTeX citation entry, and more than a dozen papers use it
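For the single-agent path listed above, a random-action episode might look like the sketch below. The `sumo-rl-v0` id comes from the highlights; the keyword arguments `net_file`, `route_file`, `use_gui`, and `num_seconds` are assumed from the project's README and should be checked against your installed version. The helper names `make_sumo_env` and `run_random_episode` are hypothetical.

```python
def make_sumo_env(net_file: str, route_file: str, num_seconds: int = 3600):
    """Create the single-agent environment (needs SUMO and sumo-rl installed)."""
    # Imported lazily so the rest of this sketch runs without SUMO present.
    import gymnasium as gym
    import sumo_rl  # noqa: F401  (assumed to register 'sumo-rl-v0' on import)

    return gym.make(
        "sumo-rl-v0",
        net_file=net_file,      # assumed keyword: path to a .net.xml file
        route_file=route_file,  # assumed keyword: path to a .rou.xml file
        use_gui=False,
        num_seconds=num_seconds,
    )


def run_random_episode(env, max_steps: int = 1000) -> float:
    """Step any Gymnasium-style env with random actions; return total reward."""
    obs, info = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = env.action_space.sample()
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += float(reward)
        if terminated or truncated:
            break
    return total_reward
```

With SUMO installed, `run_random_episode(make_sumo_env("my.net.xml", "my.rou.xml"))` would give a random-policy baseline to compare a trained agent against; the file names here are placeholders.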

Caveats

  • Requires a separate SUMO installation (PPA on Ubuntu, or equivalent) plus the SUMO_HOME environment variable
  • Libsumo fast path is mutually exclusive with sumo-gui and multi-simulation parallelism
  • The README notes stable-baselines3 needs a pre-release (>=2.0.0a9) for Gymnasium compatibility
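The first two caveats can be checked before you launch a run. The sketch below is a hypothetical preflight helper, not part of sumo-rl: `check_sumo_setup` and its parameters are made up for illustration, and it assumes that any non-empty LIBSUMO_AS_TRACI value turns Libsumo on.

```python
import os


def check_sumo_setup(env: dict, use_gui: bool = False, parallel_sims: int = 1) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not env.get("SUMO_HOME"):
        problems.append("SUMO_HOME is not set")
    # Assumption: a non-empty LIBSUMO_AS_TRACI value enables Libsumo.
    if env.get("LIBSUMO_AS_TRACI"):
        if use_gui:
            problems.append("Libsumo cannot be combined with sumo-gui")
        if parallel_sims > 1:
            problems.append("Libsumo cannot run multiple simulations in parallel")
    return problems


# Check the current process environment for a headless, single-simulation run.
problems = check_sumo_setup(dict(os.environ))
for problem in problems:
    print("setup problem:", problem)
```

Passing the environment in as a dict keeps the check easy to exercise without touching the real process environment.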

Verdict

Worth a look if you’re doing RL research on urban traffic control and want standard APIs without writing TraCI boilerplate. Skip it if you need real-time production signal optimization—this is a research sandbox, not a city operations tool.
