Swap a U-Net's backbone, win Kaggle
A VGG11 encoder pre-trained on ImageNet turns out to be a surprisingly effective drop-in replacement for U-Net's default encoder.

What it does
TernausNet is a U-Net variant for binary image segmentation that swaps the standard encoder for a VGG11 backbone pre-trained on ImageNet. It was the core of the first-place solution in the 2017 Carvana Image Masking Challenge (735 teams). The repo provides the architecture and pre-trained weights; a separate repo hosts the full training pipeline.
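To make the encoder swap concrete, here is a minimal sketch that traces feature-map shapes through VGG11's convolutional stack. The layer list is the standard VGG11 layout (configuration "A"). The function name `trace_encoder` and the idea of taking one skip connection before each pooling step are illustrative assumptions, not code from the repo.

```python
# Layer layout of VGG11's convolutional part (configuration "A"):
# integers are 3x3 conv output channels, "M" is a 2x2 max-pool with stride 2.
VGG11_CFG = [64, "M", 128, "M", 256, 256, "M", 512, 512, "M", 512, 512, "M"]


def trace_encoder(height, width, in_channels=3):
    """Return the (channels, height, width) shapes seen by a VGG11 encoder.

    Hypothetical helper: `skips` holds the feature map just before each
    pooling step, which is what a U-Net decoder would typically reuse.
    """
    skips = []
    channels = in_channels
    for layer in VGG11_CFG:
        if layer == "M":
            if height % 2 or width % 2:
                # Five poolings halve the size five times, so 2**5 = 32.
                raise ValueError("spatial size must be divisible by 32")
            skips.append((channels, height, width))
            height, width = height // 2, width // 2
        else:
            channels = layer  # 3x3 conv with padding 1 keeps height and width
    return skips, (channels, height, width)


skips, bottleneck = trace_encoder(256, 256)
for shape in skips:
    print("skip:", shape)
print("bottleneck:", bottleneck)
```

For a 256×256 RGB input this yields five skip shapes, from (64, 256, 256) down to (512, 16, 16), and a (512, 8, 8) bottleneck. The divisibility check reflects a practical constraint of this kind of design: input sizes that are not multiples of 32 leave the upsampled decoder maps misaligned with the skips.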
The interesting bit
The pre-trained encoder transfers well even to visually distant domains — the README shows it converging faster on aerial imagery, not just cars. That “ImageNet features are generic” hunch pays off here in a very specific, measurable way.
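"Converging faster" only means something once you pick a number to track per epoch. A common choice for binary masks is intersection over union (IoU, also called the Jaccard index). The sketch below assumes that metric for illustration; the function name `iou` and the toy masks are made up, not taken from the repo.

```python
def iou(pred, target):
    """Intersection over union of two same-sized binary masks.

    Masks are nested lists of 0/1 values (rows of pixels).
    """
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += int(p == 1 and t == 1)
            union += int(p == 1 or t == 1)
    # Two empty masks agree perfectly, so treat that case as a score of 1.
    return intersection / union if union else 1.0


# Toy 2x3 masks: 2 pixels overlap, 4 pixels are set in at least one mask.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
print(iou(pred, target))  # 0.5
```

Plotting a score like this on a validation set after every epoch, once with a pre-trained encoder and once with random initialization, is one way to turn the "generic features" hunch into a curve you can compare.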
Key highlights
- Won Kaggle Carvana challenge (1st / 735 teams) in 2017
- Encoder pre-trained on ImageNet, not the target segmentation data
- `pip install ternausnet` for quick use
- Published arXiv paper with full architecture details
- Training/testing example lives in a separate repo: ternaus/robot-surgery-segmentation
Caveats
- The README is sparse: no code examples, no API docs, no usage snippet beyond `pip install`
- The actual training pipeline is off in another repository; this repo is essentially model weights + architecture definition
- Last meaningful update appears to be 2018; U-Net variants with ResNet or EfficientNet encoders have since surpassed it
Verdict
Worth a look if you’re studying how transfer learning accelerates segmentation convergence, or need a battle-tested baseline. Skip it if you want a maintained, batteries-included library — this is a research artifact with a Kaggle pedigree.