Teaching neural nets to find needles in haystacks of images
A PyTorch implementation of attention-based multiple instance learning for when you only know the bag's label, not each item's.

What it does
This repo implements the 2018 paper “Attention-based Deep Multiple Instance Learning” (Ilse, Tomczak, and Welling), which adds an attention mechanism to Multiple Instance Learning (MIL). In MIL, you get a “bag” of images and only the bag has a label (say, “contains a 9”), while the individual images inside go unlabeled. The model must learn to spot which instances matter without ever being told directly. The included code runs an MNIST experiment where bags of handwritten digits are classified this way.
The interesting bit
The attention mechanism sits before the final layer of a modified LeNet-5, learning to weight each instance’s importance rather than treating them equally. It’s a neat trick for medical imaging scenarios — the authors point to breast and colon cancer datasets — where only a tiny fraction of cells in a tissue slide may actually be cancerous, but you only have slide-level labels.
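To make the weighting idea concrete, here is a minimal sketch of attention pooling over one bag. It is written in plain Python so the arithmetic stays visible; the repo itself does this in PyTorch. It assumes a tanh-scored softmax over instances. The function name attention_pool, the parameters V and w, and all toy values are hypothetical and are not taken from the repo.

```python
import math


def attention_pool(instances, V, w):
    """Pool a bag of instance embeddings into a single bag embedding.

    instances: list of K embeddings, each a list of M floats
    V: L x M matrix as a list of rows (hypothetical learned parameter)
    w: list of L floats (hypothetical learned parameter)
    Returns (weights, bag_embedding).
    """
    # Unnormalised score per instance: w . tanh(V h)
    scores = []
    for h in instances:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * ti for wi, ti in zip(w, hidden)))

    # Softmax across the instances of the bag (shifted by the max for stability)
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Bag embedding: attention-weighted sum of the instance embeddings
    dim = len(instances[0])
    bag = [sum(a * h[d] for a, h in zip(weights, instances)) for d in range(dim)]
    return weights, bag


# Toy bag of three 2-dimensional embeddings; the third one is the "needle".
bag_of_embeddings = [[0.1, 0.0], [0.0, 0.2], [2.0, 1.5]]
V = [[1.0, 0.5], [-0.5, 1.0]]
w = [1.0, 1.0]

weights, bag_embedding = attention_pool(bag_of_embeddings, V, w)
print(weights)        # one weight per instance, summing to 1
print(bag_embedding)  # same dimensionality as a single instance
```

The weights double as a rough per-instance explanation: the instance with the largest weight is the one that contributed most to the bag embedding, which is the property that makes this pooling attractive when only slide-level labels exist.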
Key highlights
- Attention-based MIL pooling layer in a modified LeNet-5 CNN
- MNIST-BAGS experiment with configurable bag generation
- Two data loaders: a concise one (bags ≤10 items) and the original experimental one (any length, but only tested for target digit ‘9’)
- Trains for 20 epochs with Adam, then reports bag-level and instance-level predictions
- Authors note the histopathology model is “similar” to the included one; details are in the paper
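The bag labeling rule behind the MNIST-BAGS experiment can be sketched in a few lines. This is an illustration under stated assumptions, not the repo's loader: make_bags, its parameters, and the stand-in label list are all hypothetical, and bags here have a fixed size and are sampled with replacement for simplicity.

```python
import random


def make_bags(instance_labels, num_bags, bag_size, target=9, seed=0):
    """Group instances into bags; a bag is positive if it contains the target."""
    rng = random.Random(seed)
    bags = []
    for _ in range(num_bags):
        # Sample instance indices for this bag (with replacement, for simplicity)
        indices = [rng.randrange(len(instance_labels)) for _ in range(bag_size)]
        labels = [instance_labels[i] for i in indices]
        # Only this bag-level label would be visible to the model during training
        bag_label = int(any(label == target for label in labels))
        bags.append((indices, bag_label, labels))
    return bags


# Stand-in for real MNIST digit labels (assumption: any list of ints 0-9 works)
digit_labels = [i % 10 for i in range(1000)]

bags = make_bags(digit_labels, num_bags=5, bag_size=8)
for indices, bag_label, labels in bags:
    print(bag_label, labels)
```

The instance labels are returned only so that predictions can be checked afterwards; in the MIL setting they are never used as a training signal.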
Caveats
- Depends on PyTorch 0.3.1 (released in 2018); Python 2.7 is the only version listed as tested
- No validation set, no early stopping — the authors deliberately kept the setup small
- The histopathology datasets aren’t included; you must download them separately and adapt the model yourself
- Authors explicitly state they “cannot guarantee any support for this software”
Verdict
Worth a look if you’re implementing MIL from scratch or working with weakly-labeled medical imaging. Skip it if you need a maintained, production-ready library — this is research code from 2018 with the dependencies to match.