When neural networks and probabilistic graphs had a baby
The code behind a 2015 ICCV paper that recast CRFs as RNNs for pixel-level segmentation, back when Caffe was king and PyTorch was a glimmer.

What it does
CRF-RNN performs semantic image segmentation: it labels every pixel in an image (20 object classes plus background) and traces the actual 2D outline of each object, not just bounding boxes. The code is a custom Caffe layer called MultiStageMeanfieldLayer that plugs into a standard FCN pipeline.
The interesting bit
The trick is formulating a Conditional Random Field as a recurrent neural network, then training the whole thing end-to-end. The CRF’s mean-field inference becomes unrolled iterations inside the network, so the CNN learns to produce unary potentials that play nicely with the structured refinement stage. In 2015 this was novel enough to win the ICCV Best Demo Prize.
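To make the unrolling concrete, here is a minimal NumPy sketch of mean-field inference written as a fixed number of repeated steps. It is not the MultiStageMeanfieldLayer code. It assumes a single Gaussian kernel over pixel positions, a fixed Potts compatibility matrix, and a dense N×N kernel that is only practical for toy inputs. The names mean_field, weight, and bandwidth are made up for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def mean_field(unary_logits, features, num_iters=5, weight=1.0, bandwidth=1.0):
    """Unrolled mean-field updates for a toy dense CRF.

    unary_logits: (N, L) per-pixel class scores, e.g. from a CNN.
    features:     (N, D) per-pixel features (here: positions only).
    """
    num_labels = unary_logits.shape[1]

    # Pairwise Gaussian affinity between every pair of pixels (assumption:
    # one kernel only, built densely).
    diff = features[:, None, :] - features[None, :, :]
    kernel = np.exp(-0.5 * (diff ** 2).sum(-1) / bandwidth ** 2)
    np.fill_diagonal(kernel, 0.0)  # a pixel sends no message to itself

    # Potts compatibility: penalise neighbours that take a different label.
    compat = 1.0 - np.eye(num_labels)

    q = softmax(unary_logits)  # initialise from the unaries
    for _ in range(num_iters):  # each pass is one "recurrent" step
        messages = weight * kernel @ q  # message passing (filtering)
        penalty = messages @ compat  # compatibility transform
        q = softmax(unary_logits - penalty)  # add unaries, renormalise
    return q


# Toy input: a row of six pixels; pixel 2 has noisy unaries favouring label 1.
positions = np.arange(6, dtype=float)[:, None]
unaries = np.array([[2.0, 0.0]] * 6)
unaries[2] = [0.0, 0.5]

q = mean_field(unaries, positions)
labels_before = unaries.argmax(axis=1)
labels_after = q.argmax(axis=1)
print("unary labels:  ", labels_before)
print("refined labels:", labels_after)
```

In this toy input, one pixel's unary scores disagree with its neighbours, and the pairwise term pulls it back to the majority label. Because every step in the loop is differentiable, gradients can flow back through the iterations to whatever produced the unaries, which is the property the end-to-end training relies on.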
Key highlights
- Ships with Python and MATLAB wrappers, plus a live web demo at crfasrnn.torr.vision
- Pre-trained on PASCAL VOC (21 classes including background)
- Active third-party ecosystem: PyTorch, Keras/TensorFlow, Lasagne, and pure GPU reimplementations all exist
- BSD 3-clause license
- Originally built for augmented-reality glasses to assist the partially sighted
Caveats
- Requires compiling a modified Caffe fork; the README targets Ubuntu 14.04 and involves blacklisting kernel modules for CUDA
- The deploy.prototxt is hard-wired for PASCAL VOC output dimensions; changing classes means manually re-initializing deconvolution weights
- Layer names matter: rename inference1 and the pre-trained weights silently fail to load, producing all-black predictions
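The layer-name caveat follows from Caffe matching saved parameters to layers by name. The snippet below is a toy model of that behaviour in plain Python, not pycaffe code. The helper copy_weights_by_name, the dictionaries, and every layer name other than inference1 are hypothetical. It only shows why a renamed layer ends up with its initial values and how a simple check could surface that.

```python
def copy_weights_by_name(net_layers, pretrained):
    """Copy weights into layers whose names match; return the unmatched names."""
    missing = []
    for name in net_layers:
        if name in pretrained:
            net_layers[name] = list(pretrained[name])
        else:
            # No error is raised: the layer simply keeps its initial values.
            missing.append(name)
    return missing


# Hypothetical saved weights, keyed by layer name.
pretrained = {"conv1_1": [1.0, 1.0], "inference1": [3.0, 3.0]}

# A network definition in which inference1 has been renamed.
renamed_net = {"conv1_1": [0.0, 0.0], "crf_inference": [0.0, 0.0]}

missing = copy_weights_by_name(renamed_net, pretrained)
if missing:
    print("layers left at their initial values:", missing)
```

A check like this, run once after loading, turns a silent mismatch into a visible warning before any predictions are made.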
Verdict
Worth studying if you care about the history of dense prediction or want to see how it differs from contemporaries such as early DeepLab, which applied a dense CRF as a separate post-processing step rather than training it end-to-end. For production use, the actively maintained PyTorch and Keras ports are the pragmatic choice; this repo is essentially a research artifact frozen in Caffe amber.