tanluren/yolov3-channel-and-layer-pruning

YOLO on a diet: shrinking detection models by force

A Chinese-language repo that squeezes YOLOv3/v4 through channel pruning, layer pruning, and knowledge distillation for edge deployment.

Velocity (7d): +0.6 ★ / day · Trend: steady

What it does

This project takes the Ultralytics YOLOv3/v4 implementation and runs it through a multi-stage compression pipeline. You train normally, then run “sparse training”, which penalizes the batch-normalization gamma (scale) coefficients so that the unimportant ones shrink toward zero, then prune channels or entire shortcut layers whose coefficients are smallest. A final fine-tuning stage recovers accuracy, optionally guided by knowledge distillation from the original unpruned model.
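The sparse-training and channel-selection steps can be sketched in a few lines of plain Python. This is a minimal illustration of the general idea (an L1 penalty on the gammas, then a global magnitude threshold), not the repo's code: the function names, the `prune_ratio` parameter, and the rule of always keeping one channel per layer are assumptions made for the example.

```python
def l1_subgradient(gammas, s):
    """Extra gradient that a penalty s * |gamma| adds to each BN gamma."""
    def sign(x):
        return (x > 0) - (x < 0)
    return [s * sign(g) for g in gammas]


def channel_masks(gammas_per_layer, prune_ratio):
    """Keep channels whose |gamma| is above a global percentile threshold.

    gammas_per_layer: list of lists, one list of gammas per BN layer.
    prune_ratio: fraction of all channels to drop (hypothetical parameter).
    """
    all_mags = sorted(abs(g) for layer in gammas_per_layer for g in layer)
    cut = int(len(all_mags) * prune_ratio)
    threshold = all_mags[cut - 1] if cut > 0 else float("-inf")

    masks = []
    for layer in gammas_per_layer:
        mask = [abs(g) > threshold for g in layer]
        if not any(mask):
            # Assumption: never empty a layer; keep its strongest channel.
            best = max(range(len(layer)), key=lambda i: abs(layer[i]))
            mask[best] = True
        masks.append(mask)
    return masks


if __name__ == "__main__":
    gammas = [[0.9, 0.01, 0.5], [0.02, 0.03]]
    print(channel_masks(gammas, prune_ratio=0.5))
```

During sparse training the subgradient would be added to the gammas' ordinary gradients each step; after training, the masks decide which output channels (and the matching input channels of the next layer) survive.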

The interesting bit

The repo doesn’t pick one pruning strategy. It implements three competing channel-pruning approaches (conservative shortcut-avoiding, mask-sharing, and union-mask) plus a derived layer-pruning strategy that removes entire shortcut blocks. The author also added two knowledge-distillation strategies: a basic Hinton-style classification distillation, and a detection-specific variant in which the student learns from the teacher only when the teacher is closer to the ground-truth target than the student is.
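To make the two distillation ideas concrete, here is a rough sketch in plain Python. The first function is the standard Hinton-style soft-target loss (KL divergence between temperature-softened class distributions, scaled by T²). The second illustrates the gating idea on box coordinates; using squared error on boxes, and all function and parameter names, are assumptions for the example rather than the repo's actual loss.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of logits / temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    log_total = peak + math.log(sum(math.exp(z - peak) for z in scaled))
    return [z - log_total for z in scaled]


def soft_target_loss(student_logits, teacher_logits, temperature=3.0):
    """KL(teacher || student) on softened class distributions, times T^2."""
    log_t = log_softmax(teacher_logits, temperature)
    log_s = log_softmax(student_logits, temperature)
    kl = sum(math.exp(lt) * (lt - ls) for lt, ls in zip(log_t, log_s))
    return kl * temperature ** 2


def squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def gated_box_loss(student_box, teacher_box, target_box):
    """Imitate the teacher only where it is closer to the target than the student."""
    teacher_err = squared_error(teacher_box, target_box)
    student_err = squared_error(student_box, target_box)
    if teacher_err < student_err:
        return squared_error(student_box, teacher_box)
    return 0.0  # teacher is no better here, so it gives no signal


if __name__ == "__main__":
    print(soft_target_loss([2.0, 0.0, 0.0], [0.0, 2.0, 0.0]))
    print(gated_box_loss([0, 0, 10, 10], [1, 1, 9, 9], [1, 1, 9, 9]))
```

The gate keeps a weak teacher prediction from pulling the student away from a box it already localizes well, which is the motivation for treating detection differently from plain classification.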

Key highlights

  • Supports YOLOv3, YOLOv3-SPP, YOLOv3-tiny, YOLOv4, and YOLOv4-tiny
  • Three sparse-training schedules: a constant penalty, a global decay of the penalty at 50% of the epochs, or a local decay applied only to the least-important 15% of channels
  • Layer pruning removes shortcut blocks (up to 48 of 69 eligible layers in YOLOv3) for speed gains beyond what channel pruning alone achieves
  • Knowledge distillation via --t_cfg and --t_weights flags during finetuning
  • Mixed-precision training via NVIDIA Apex for faster iteration
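
The three sparse-training schedules listed above could look roughly like this. It is a sketch under stated assumptions: the `decay` factor, the choice to start the local decay at the same halfway point as the global one, and ranking channel importance by |gamma| are all illustrative guesses, and the function name is hypothetical.

```python
def penalty_per_channel(gammas, s, epoch, total_epochs,
                        schedule="constant", decay=0.01, weak_fraction=0.15):
    """Return the sparsity coefficient applied to each channel at this epoch.

    schedule: "constant", "global", or "local" (names are illustrative).
    decay: factor applied to s once decay kicks in (assumed value).
    weak_fraction: share of lowest-|gamma| channels decayed in "local" mode.
    """
    if schedule not in ("constant", "global", "local"):
        raise ValueError(f"unknown schedule: {schedule!r}")

    n = len(gammas)
    coeffs = [s] * n
    # Before the halfway point every schedule applies the full penalty.
    if schedule == "constant" or epoch < 0.5 * total_epochs:
        return coeffs

    if schedule == "global":
        return [s * decay] * n

    # "local": relax the penalty only on the weakest channels.
    n_weak = int(n * weak_fraction)
    weakest = sorted(range(n), key=lambda i: abs(gammas[i]))[:n_weak]
    for i in weakest:
        coeffs[i] = s * decay
    return coeffs


if __name__ == "__main__":
    gammas = [0.9, 0.01, 0.5, 0.02, 0.7, 0.3, 0.8, 0.6]
    print(penalty_per_channel(gammas, 1.0, epoch=60, total_epochs=100,
                              schedule="local", decay=0.5, weak_fraction=0.25))
```

Each coefficient would multiply the sign of the corresponding gamma to form the extra gradient added during sparse training.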

Caveats

  • README and code comments are entirely in Chinese; English speakers will need translation help
  • Several “strategies” require uncommenting hardcoded lines in source files rather than command-line flags
  • The author notes that sparse training is “the top priority” and that finding the right penalty coefficient s takes significant trial and error

Verdict

Worth a look if you’re shipping YOLO to resource-constrained hardware and can invest time in tuning the sparse-training hyperparameters. Skip it if you need a polished, one-command solution or don’t read Chinese.
