Sub-1M parameters, 71 FPS: a segmentation network that fits on a diet
LEDNet trades parameter bloat for speed by shuffling channels and slimming the decoder with an attention pyramid.

What it does
LEDNet is a PyTorch implementation of a lightweight encoder-decoder network for real-time semantic segmentation, trained and evaluated on the Cityscapes dataset. The encoder uses a ResNet backbone where each residual block splits and shuffles channels to cut computation; the decoder employs an attention pyramid network (APN) to keep the whole model under 1 million parameters while pushing past 71 FPS on a single GTX 1080Ti.
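Both headline numbers are easy to check on any PyTorch model. The sketch below is not taken from the repository: `count_parameters`, `measure_fps`, and the `toy` model are hypothetical names used for illustration, and the 512x1024 default input size is an assumption rather than a documented setting.

```python
import time

import torch
import torch.nn as nn


def count_parameters(model: nn.Module) -> int:
    """Return the number of trainable parameters in the model."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


@torch.no_grad()
def measure_fps(model, input_shape=(1, 3, 512, 1024), warmup=2, runs=5):
    """Estimate forward-pass throughput in frames per second.

    The default input shape is an assumed Cityscapes-style resolution.
    """
    device = next(model.parameters()).device
    model.eval()
    x = torch.randn(*input_shape, device=device)
    for _ in range(warmup):  # warm-up passes are excluded from timing
        model(x)
    if device.type == "cuda":
        torch.cuda.synchronize()  # wait for queued GPU work before timing
    start = time.perf_counter()
    for _ in range(runs):
        model(x)
    if device.type == "cuda":
        torch.cuda.synchronize()
    elapsed = time.perf_counter() - start
    return runs * input_shape[0] / elapsed


# Hypothetical stand-in model, NOT the LEDNet architecture.
toy = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(16, 20, kernel_size=1),
)
print("trainable parameters:", count_parameters(toy))
```

Swapping `toy` for the actual network and moving it to a GPU gives numbers to set against the sub-1M and 71 FPS claims, bearing in mind that throughput depends heavily on hardware and input resolution.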
The interesting bit
The channel split-and-shuffle operation in the encoder is borrowed from ShuffleNet’s playbook, but applied here to dense prediction rather than classification. The asymmetry is deliberate: the encoder does the heavy lifting with cheap operations, while the APN decoder stays minimal rather than mirroring the encoder’s depth.
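To make the split-and-shuffle idea concrete, here is a minimal sketch of a residual block that halves the channels, processes each half separately, and then interleaves them. It is a simplified illustration, not the repository's block: `SplitShuffleBlock` is a hypothetical name, and a plain 3x3 convolution per branch is an assumed stand-in for whatever layers the real encoder uses.

```python
import torch
import torch.nn as nn


def channel_shuffle(x: torch.Tensor, groups: int = 2) -> torch.Tensor:
    """Interleave channels across groups, ShuffleNet-style."""
    n, c, h, w = x.size()
    assert c % groups == 0, "channels must be divisible by groups"
    x = x.reshape(n, groups, c // groups, h, w)
    x = x.transpose(1, 2).contiguous()  # swap group and per-group axes
    return x.reshape(n, c, h, w)


class SplitShuffleBlock(nn.Module):
    """Hypothetical simplified block: split, convolve, concat, add, shuffle."""

    def __init__(self, channels: int):
        super().__init__()
        assert channels % 2 == 0, "channels must be even to split in half"
        half = channels // 2
        # Each branch sees only half the channels, which roughly halves
        # the convolution cost compared with one full-width 3x3 conv.
        self.left = nn.Conv2d(half, half, kernel_size=3, padding=1)
        self.right = nn.Conv2d(half, half, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        half = x.size(1) // 2
        left, right = x[:, :half], x[:, half:]  # channel split
        out = torch.cat([self.left(left), self.right(right)], dim=1)
        out = self.relu(out + x)  # residual connection
        return channel_shuffle(out, groups=2)  # mix the two halves


block = SplitShuffleBlock(channels=8)
features = torch.randn(1, 8, 16, 32)
print(block(features).shape)
```

Without the shuffle, the two halves would never exchange information across stacked blocks; the interleaving is what lets the next block's split see channels from both branches.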
Key highlights
- Under 1M parameters; claims 70.6% class IoU and over 71 FPS on Cityscapes (fine + coarse labels)
- Encoder pre-training on ImageNet supported, with a separate fine-tuning path for the decoder
- Includes evaluation scripts for inference timing, official server submission, and per-class IoU
- Visualization via Visdom; torchsummary integration for model inspection
- Code targets PyTorch 0.4.1 / CUDA 9.0, though the README notes 0.4.1+ compatibility
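For readers unfamiliar with the per-class IoU metric mentioned above, it is usually derived from a confusion matrix. The following is a generic sketch rather than the repository's evaluation script: `per_class_iou` is a hypothetical helper, and treating label 255 as "ignore" is an assumption based on the common Cityscapes training-ID convention.

```python
import torch


def per_class_iou(pred, target, num_classes, ignore_index=255):
    """Compute IoU for each class from integer label tensors.

    Classes that appear in neither pred nor target yield NaN (0 / 0).
    """
    pred = pred.flatten()
    target = target.flatten()
    keep = target != ignore_index  # drop pixels marked as ignore
    pred, target = pred[keep], target[keep]
    # Encode each (target, pred) pair as one index, then count occurrences.
    conf = torch.bincount(
        num_classes * target + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)
    intersection = conf.diag().float()
    # union = ground-truth pixels + predicted pixels - overlap
    union = conf.sum(dim=1).float() + conf.sum(dim=0).float() - intersection
    return intersection / union


target = torch.tensor([0, 0, 1, 1, 255])
pred = torch.tensor([0, 1, 1, 1, 0])
iou = per_class_iou(pred, target, num_classes=2)
print(iou)  # class 0: 1/2, class 1: 2/3
```

The mean over classes (skipping NaN entries) gives the single class-IoU figure that benchmark tables typically report.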
Caveats
- The author notes results “need to be further improved” due to limited GPU resources during development
- Only Cityscapes is currently implemented; CamVid, VOC, and ADE20K are listed as future work
- Trained model links in the README appear incomplete ("./save/" path without actual file references)
Verdict
Worth a look if you’re building real-time segmentation pipelines and need a proven lightweight baseline to benchmark against. Skip it if you need production-ready multi-dataset support or modern PyTorch versions without migration work.