A Keras implementation that actually explains itself
Matterport's Mask R-CNN repo ships with notebooks that let you inspect every layer, anchor, and weight histogram instead of treating the model as a black box.

What it does
This is a Keras/TensorFlow implementation of Mask R-CNN, the two-stage network that spits out both bounding boxes and pixel-level segmentation masks for each object instance. It runs on a ResNet101 backbone with a Feature Pyramid Network on top. The repo includes pre-trained MS COCO weights, multi-GPU training support, and examples for training on your own dataset.
The interesting bit
Most computer vision repos hand you a script and wish you luck. This one includes six Jupyter notebooks that let you step through the pipeline—anchor filtering, box refinement, mask generation, layer activations, weight histograms—so you can see exactly where your model is confused or exploding. It’s essentially a debugger disguised as a model zoo entry.
Key highlights
- Pre-trained COCO weights available from the releases page; demo notebook runs inference on arbitrary images in minutes
- Config and Dataset base classes designed to be subclassed for custom datasets without touching model code
- Includes ParallelModel class for multi-GPU training
- Bounding boxes are generated on-the-fly from masks rather than loaded from datasets, which simplifies augmentation and multi-dataset training
- Authors openly document where they deviated from the original paper: fixed 1024×1024 resizing instead of variable sizing, lower learning rate to avoid weight explosion, and the bounding-box generation approach above
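
To make the "boxes from masks" point concrete, here is a minimal NumPy sketch of the idea. It is an illustration, not the repo's actual code: the function name boxes_from_masks, the [H, W, N] mask layout, and the (y1, x1, y2, x2) convention with exclusive lower-right edges are all assumptions chosen for this example.

```python
import numpy as np


def boxes_from_masks(masks: np.ndarray) -> np.ndarray:
    """Compute one tight box per instance from a [H, W, N] boolean mask stack.

    Hypothetical helper for illustration. Returns an [N, 4] int32 array of
    (y1, x1, y2, x2), where y2 and x2 are exclusive. An empty mask yields
    an all-zero box.
    """
    num_instances = masks.shape[-1]
    boxes = np.zeros((num_instances, 4), dtype=np.int32)
    for i in range(num_instances):
        m = masks[:, :, i]
        # Indices of rows and columns that contain at least one mask pixel.
        rows = np.where(np.any(m, axis=1))[0]
        cols = np.where(np.any(m, axis=0))[0]
        if rows.size:
            boxes[i] = (rows[0], cols[0], rows[-1] + 1, cols[-1] + 1)
    return boxes


# Toy data: two rectangular "instances" on an 8x8 canvas.
masks = np.zeros((8, 8, 2), dtype=bool)
masks[2:5, 1:4, 0] = True
masks[6:8, 5:7, 1] = True

boxes = boxes_from_masks(masks)

# Augmentation only has to transform the masks; boxes are recomputed afterwards.
flipped_boxes = boxes_from_masks(masks[:, ::-1, :])
```

The last line shows why this simplifies augmentation: a horizontal flip is applied to the masks alone, and the matching boxes fall out of the same function with no separate box-transform logic to keep in sync.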
Caveats
- Requirements specify Python 3.4, TensorFlow 1.3, and Keras 2.0.8—this is a 2017-era stack, so expect friction with modern environments
- The README lists speed improvements as a desired contribution, suggesting the Python-heavy implementation isn’t optimized for production throughput
Verdict
Grab this if you’re learning instance segmentation, debugging a Mask R-CNN variant, or need a well-documented starting point for custom datasets. Skip it if you want a production-ready, state-of-the-art detector for modern TensorFlow/PyTorch stacks.