pranz24/pytorch-soft-actor-critic

A museum piece from the SAC era

A clean PyTorch reimplementation of Soft Actor-Critic, now archived and gathering dust.

942 stars · Python · Agents · ML Frameworks
Velocity (7d): +0.3 ★/day · Trend: steady

What it does

Implements Soft Actor-Critic (SAC), an off-policy reinforcement learning algorithm that maximizes expected return plus a policy-entropy bonus weighted by a temperature, alpha. The repo covers both the original 2018 stochastic-actor paper and the 2019 follow-up, plus a deterministic variant. It targets standard MuJoCo Gym environments such as HalfCheetah and Humanoid.
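To make the "reward plus entropy" idea concrete, here is a minimal, framework-free sketch of the soft Bellman target that SAC critics regress toward in the twin-critic formulation. It is not taken from the repository: the function name soft_q_target and the default values for alpha and gamma are assumptions chosen for illustration.

```python
def soft_q_target(reward, done, next_q1, next_q2, next_log_prob,
                  alpha=0.2, gamma=0.99):
    """Return r + gamma * (1 - done) * (min(Q1', Q2') - alpha * log pi(a'|s')).

    All arguments are plain floats for a single transition (hypothetical
    interface; a real implementation would operate on batched tensors).
    """
    # Pessimistic value estimate: take the smaller of the two target critics.
    min_next_q = min(next_q1, next_q2)
    # Entropy bonus: subtracting alpha * log pi favours less likely actions.
    soft_value = min_next_q - alpha * next_log_prob
    # Do not bootstrap past a terminal transition.
    not_done = 0.0 if done else 1.0
    return reward + gamma * not_done * soft_value
```

With alpha set to 0 the entropy term vanishes and the target reduces to an ordinary clipped double-Q target, which is one way to see what the temperature buys you.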

The interesting bit

The author exposed knobs that often stay hidden: automatic entropy tuning, hard vs. soft target updates, and a deterministic policy mode. There’s even a tuned alpha per environment, the kind of detail you usually dig out of someone else’s blog post.
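The difference between hard and soft target updates can be sketched as follows. This is an illustration under stated assumptions, not the repository's code: parameters are modelled as a plain dict of floats rather than PyTorch tensors, and the names soft_update, hard_update, and tau are chosen here for clarity.

```python
def soft_update(target, source, tau):
    """Polyak averaging: move each target parameter a fraction tau toward source."""
    for name, value in source.items():
        target[name] = (1.0 - tau) * target[name] + tau * value


def hard_update(target, source):
    """Overwrite every target parameter with the current source value."""
    for name, value in source.items():
        target[name] = value


# Toy parameters (hypothetical values) standing in for network weights.
online = {"w": 1.0, "b": -2.0}
target = {"w": 0.0, "b": 0.0}

soft_update(target, online, tau=0.5)   # target drifts halfway toward online
halfway = dict(target)

hard_update(target, online)            # target now equals online exactly
```

A soft update with a small tau applied every step gives a slowly moving target; a hard update applied every N steps gives a target that is frozen between copies. Setting tau to 1 makes the soft update identical to a hard one.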

Key highlights

  • Reproduces both Haarnoja et al. (2018) and the updated (2019) SAC formulations
  • Optional deterministic policy with hard target updates, for ablation-minded researchers
  • Per-environment temperature (alpha) defaults baked in
  • Compact implementation spread over a handful of short Python files; dependencies are just PyTorch and mujoco-py
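As an alternative to a fixed per-environment alpha, automatic entropy tuning adjusts the temperature so that the policy's entropy tracks a target value. The sketch below shows one gradient-descent step on the usual loss, -log_alpha * mean(log_pi + target_entropy), in plain Python. The function name, the plain (non-Adam) update, and the learning rate are assumptions for illustration; the repository's optimizer and hyperparameters may differ.

```python
import math


def tune_log_alpha(log_alpha, log_probs, target_entropy, lr=3e-4):
    """One plain gradient-descent step on the temperature loss.

    loss(log_alpha) = -log_alpha * mean(log_pi + target_entropy)
    so d loss / d log_alpha = -mean(log_pi + target_entropy).
    """
    mean_gap = sum(lp + target_entropy for lp in log_probs) / len(log_probs)
    grad = -mean_gap
    return log_alpha - lr * grad


# Hypothetical batch: the policy's entropy estimate is -mean(log_probs) = 2.0,
# well above a target of -6.0 (a common heuristic is minus the action dimension).
new_log_alpha = tune_log_alpha(0.0, [-1.0, -3.0], target_entropy=-6.0, lr=0.1)
new_alpha = math.exp(new_log_alpha)  # temperature actually used in the losses
```

Because the policy here is more random than the target requires, the step lowers log_alpha and so shrinks the entropy bonus; a policy that is too deterministic would push it the other way.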

Caveats

  • Explicitly archived and unmaintained by the author
  • Hard-coded to MuJoCo Gym environments; no modern Gymnasium migration
  • mujoco-py itself is deprecated, so getting this running is increasingly archaeological

Verdict

Useful if you’re tracing SAC’s evolution or need a minimal reference to compare against your own implementation. Skip it if you want something production-ready or modern: the field has moved on, and so has the tooling.
