Your neural network is lying about how sure it is
A dead-simple post-processing trick that learns a single scalar to stop softmax from being overconfident.

What it does
Temperature scaling is a one-parameter fix for a common pathology: neural networks output probabilities that sound more certain than they are. You learn a single scalar T on a held-out validation set, divide your logits by it before the softmax, and predictions made with 80% confidence end up correct roughly 80% of the time. The repo is one file (temperature_scaling.py) you copy into your project.
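To make "correct 80% of the time" concrete, here is a minimal sketch of expected calibration error (ECE), the metric used in the Guo et al. paper the repo is based on. It is not taken from the repo: the function name, the ten equal-width bins, and the toy data are assumptions for illustration.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Bin-size-weighted average of |accuracy - mean confidence| per bin."""
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 belongs in the last bin, hence the min().
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        ece += (len(members) / total) * abs(accuracy - avg_conf)
    return ece


# Toy data (made up): ten predictions, each made with 0.8 confidence.
calibrated = expected_calibration_error([0.8] * 10, [1] * 8 + [0] * 2)
overconfident = expected_calibration_error([0.8] * 10, [1] * 5 + [0] * 5)
```

In the first case 8 of 10 predictions are right, so the gap is essentially zero; in the second only 5 of 10 are right, so the gap is about 0.3. Temperature scaling aims to shrink that gap.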
The interesting bit
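The cleverness is in the restraint. No retraining, no architecture changes, just softmax(z)_i = e^(z_i/T) / sum_j e^(z_j/T), where T is fit by minimizing negative log-likelihood on the validation set. Because dividing every logit by the same positive T never changes which class scores highest, accuracy is untouched; only the confidence moves. It’s post-hoc calibration: the model stays the same, only the thermometer changes.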
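The fit itself can be sketched in plain Python. This is an illustration, not the repo's code: it uses only the standard library and swaps in a golden-section search over 1/T (the NLL is convex in 1/T) in place of optimizing T with PyTorch. The function names, search bounds, and toy logits are all made up.

```python
import math


def nll(logits, labels, temperature):
    """Mean negative log-likelihood of the true labels under softmax(z / T)."""
    total = 0.0
    for z, y in zip(logits, labels):
        scaled = [v / temperature for v in z]
        m = max(scaled)  # subtract the max for numerical stability
        log_norm = m + math.log(sum(math.exp(s - m) for s in scaled))
        total += log_norm - scaled[y]
    return total / len(labels)


def fit_temperature(logits, labels, lo=0.05, hi=20.0, iters=60):
    """Golden-section search over beta = 1/T; returns the fitted T."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = 1.0 / hi, 1.0 / lo  # search interval for beta
    for _ in range(iters):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if nll(logits, labels, 1.0 / c) < nll(logits, labels, 1.0 / d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 2.0 / (a + b)  # T at the midpoint of the final beta interval


# Toy validation set (made up): the model always favours class 0 by a
# margin of 4, but is right only 3 times out of 4.
toy_logits = [[4.0, 0.0]] * 4
toy_labels = [0, 0, 0, 1]
fitted_t = fit_temperature(toy_logits, toy_labels)  # about 3.64 (= 4 / ln 3)
```

At T = 1 the toy model claims about 98% confidence while being right 75% of the time; the fitted T pulls the confidence down to 75% without changing any prediction.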
Key highlights
- Single learned parameter, fit on validation data in one pass
- Works with any trained PyTorch classifier (wraps your existing model)
- Based on Guo et al.’s “On Calibration of Modern Neural Networks” (ICML 2017)
- Includes before/after calibration plots for ResNet on CIFAR-100
- Author explicitly recommends better-maintained alternatives like probmetrics for production use
Caveats
- Repo is unmaintained — written for PyTorch 0.3, eight years stale
- Requires careful validation-set hygiene: fit T on the same held-out validation split that was set aside when the model was trained. Calibrating on examples the weights were fit to, or on the test set, leaks information
- Not a package; literally just a file to copy-paste
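One way to keep the validation-set caveat manageable is to make the split reproducible, so the calibration step can reload exactly the examples that were held out during training. A minimal sketch, assuming a hypothetical split_indices helper and an arbitrary 5,000-example validation split carved from a 50,000-example training set:

```python
import random


def split_indices(n_examples, n_valid, seed=0):
    """Shuffle indices with a fixed seed and carve off a validation split."""
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)
    return indices[n_valid:], indices[:n_valid]


# Use the same seed (or save valid_idx to disk) for training and calibration.
train_idx, valid_idx = split_indices(n_examples=50_000, n_valid=5_000, seed=0)
```

Train on train_idx only, then fit T on valid_idx only.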
Verdict
Worth reading for the concept and the 50 lines of implementation, but don’t depend on it in production. Use it to understand temperature scaling, then switch to a maintained library.