ramprs/grad-cam

Shine a flashlight on your CNN's reasoning

Grad-CAM generates visual heatmaps showing exactly which image regions convinced a neural network to say "cat" instead of "dog."

1.7k stars · Lua · Other · AI · Computer Vision

What it does

Grad-CAM produces coarse localization maps highlighting the image regions that matter most for predicting a concept, essentially asking a CNN “where are you looking?” The repo bundles three Torch/Lua scripts: one for image classification, one for visual question answering, and one for image captioning. Feed it an image and a target label (or VQA answer, or caption), and it spits out a heatmap overlay.

The interesting bit

The technique works without retraining or architectural surgery: it hooks into existing Caffe models (VGG, AlexNet), backpropagates the score for the target concept to a chosen convolutional layer, and uses those gradients to weight that layer's feature maps into a heatmap. The VQA and captioning demos are the real flex: you can force the model to explain why it answered “green” versus “yellow” for the same fire-hydrant image, revealing how fragile or context-dependent its “reasoning” is.
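The gradient step described above can be sketched in a few lines. This is a minimal illustration of the Grad-CAM arithmetic, not the repo's Torch/Lua code: the function names `grad_cam` and `normalize` are hypothetical, and the inputs are plain nested lists standing in for the feature maps and gradients that a real framework would produce with a forward and backward pass. Each channel's gradient is averaged into a single weight, the feature maps are summed with those weights, and negative values are clipped so only evidence in favor of the target concept survives.

```python
from typing import List

Map2D = List[List[float]]


def grad_cam(activations: List[Map2D], gradients: List[Map2D]) -> Map2D:
    """Combine feature maps into one heatmap (hypothetical helper).

    activations[k][i][j] is feature map k of the chosen conv layer.
    gradients[k][i][j] is d(target score) / d(activations[k][i][j]).
    Both are assumed to be supplied by the caller.
    """
    if len(activations) != len(gradients):
        raise ValueError("activations and gradients need the same channel count")

    height = len(activations[0])
    width = len(activations[0][0])
    cam = [[0.0] * width for _ in range(height)]

    for fmap, grad in zip(activations, gradients):
        # Channel weight: global average of the gradient over all positions.
        weight = sum(sum(row) for row in grad) / (height * width)
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * fmap[i][j]

    # ReLU: keep only regions that push the target score up.
    return [[max(0.0, value) for value in row] for row in cam]


def normalize(cam: Map2D) -> Map2D:
    """Scale a non-negative heatmap to [0, 1] so it can be drawn as an overlay."""
    peak = max(max(row) for row in cam)
    if peak == 0:
        return cam
    return [[value / peak for value in row] for row in cam]
```

The resulting map has the spatial size of the chosen layer, which is why the output is coarse; it would still need to be upsampled to the input resolution before being blended over the image.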

Key highlights

  • Classification, VQA, and captioning pipelines in one repo
  • Uses pretrained Caffe models; no fine-tuning required
  • GPU/CPU toggle, heatmap or raw output, layer selection all exposed as CLI flags
  • BSD license; submodules pull in VQA_LSTM_CNN and neuraltalk2 dependencies
  • Live demo at gradcam.cloudcv.org if you want to skip the Lua toolchain

Caveats

  • Built for Torch7 and Caffe — a 2017-era stack that now feels archaeological
  • Requires manual submodule init and model downloads; setup is not one-command
  • Layer names default to VGG-specific values (relu5_3, relu5_4), so using other architectures means reading the source

Verdict

Grab this if you’re doing interpretability research on legacy models or need the original implementation as a reproducible baseline when citing Grad-CAM. Skip it if you want a modern PyTorch/TensorFlow implementation: those exist elsewhere and won’t make you wrangle .caffemodel files.
