vietnh1009/Super-mario-bros-A3C-pytorch

Mario learns to jump with kindergarten math

A stripped-down PyTorch A3C implementation that proves you don't need 500 lines of boilerplate to teach an agent to finish World 1-1.

1.1k stars · Python · Agents · ML Frameworks
Velocity (7d): +0.4 ★/day · Trend: steady

What it does: Trains an agent to play Super Mario Bros. using the Asynchronous Advantage Actor-Critic (A3C) algorithm, introduced in DeepMind's 2016 paper "Asynchronous Methods for Deep Reinforcement Learning." Several workers play the game in parallel, each asynchronously applying its gradient updates to a shared model, which helps training escape local optima faster than a lone plumber would. The repo includes train.py, test.py, and a Google Drive folder of pre-trained weights.
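The asynchronous update pattern described above can be sketched on a toy problem. This is a minimal, standard-library-only illustration and not the repo's code: the function name `run_async_workers`, the quadratic loss, and every hyperparameter are assumptions chosen for brevity. Each worker thread copies the shared parameter, computes a gradient locally, and applies it to the shared value without waiting for the other workers. A3C applies the same pattern to the weights of an actor-critic network.

```python
import threading


def run_async_workers(num_workers=4, steps=200, lr=0.05, target=3.0):
    """Minimize (w - target)^2 with several workers updating one shared w.

    Hypothetical toy example: a real A3C worker would compute its gradient
    from game rollouts instead of from a closed-form loss.
    """
    shared = {"w": 0.0}          # stands in for the global model's weights
    lock = threading.Lock()      # guards only the write to the shared value

    def worker():
        for _ in range(steps):
            local_w = shared["w"]              # sync local copy from global
            grad = 2.0 * (local_w - target)    # gradient of (w - target)^2
            with lock:
                # Apply the (possibly slightly stale) gradient to the
                # shared parameter; other workers may have moved it since.
                shared["w"] -= lr * grad

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["w"]
```

Calling `run_async_workers()` returns a value close to `target`, even though no worker ever waits for another before computing its gradient.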

The interesting bit: The author deliberately stripped away the usual cruft (fancy preprocessing pipelines, exotic weight initializations, environment wrappers) to show that “minimal setup + correct algorithm = working agent.” The README also explains actor, critic, advantage, and asynchrony through an extended “dad and kid at kindergarten” analogy, without a single equation.
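For readers who do want the arithmetic behind the analogy, the advantage can be sketched in a few lines. This is a generic illustration of the standard n-step formulation, not code taken from the repo; the function name `n_step_advantages`, its parameters, and the default discount factor are assumptions. The critic supplies a value estimate for each visited state, and the advantage is how much better the observed discounted return turned out to be than that estimate.

```python
def n_step_advantages(rewards, values, bootstrap_value, gamma=0.9):
    """Return (returns, advantages) for one rollout segment.

    rewards:         reward received at each step of the segment
    values:          the critic's estimate V(s_t) for each step
    bootstrap_value: the critic's estimate for the state after the last
                     step (0.0 if the episode ended there)
    gamma:           discount factor (illustrative default, an assumption)
    """
    returns = []
    running = bootstrap_value
    # Walk backwards so each return reuses the one computed after it:
    # R_t = r_t + gamma * R_{t+1}
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()

    # Advantage: observed return minus what the critic expected.
    advantages = [ret - v for ret, v in zip(returns, values)]
    return returns, advantages
```

In an actor-critic update, the actor is pushed toward actions with positive advantage, while the critic is trained to shrink the gap between its estimates and the observed returns.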

Key highlights

  • Pure PyTorch, no distributed-training frameworks required
  • Pre-trained models reportedly clear 19 of the game's 32 stages (up from the author’s initial 9, thanks to a community contribution)
  • Dependencies are barebones: Python 3.6, PyTorch, OpenAI Gym, OpenCV, NumPy
  • train.py and test.py are the only entry points, so no config-file archaeology is needed

Caveats

  • The README doesn’t specify hardware requirements, training time, or how many parallel workers are used
  • No code documentation or inline comments are shown; you’ll be reading the source directly
  • Pre-trained weights live on Google Drive with no versioning or checksums mentioned
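Since no checksums are published, one workaround is to record your own hash of the weights the first time you download them, then compare against it on later downloads. The sketch below uses only the Python standard library; the function name `sha256_of_file` and the file name in the usage note are hypothetical, not something the repo provides.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For example, `sha256_of_file("downloaded_weights.bin")` (a placeholder name) gives a digest you can store alongside your experiment notes and re-check whenever the file is fetched again.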

Verdict: Grab this if you want a readable, no-magic A3C reference implementation in PyTorch. Skip it if you need production-grade reproducibility, hyperparameter sweeps, or modern alternatives like PPO.
