google-research/morph-net

Shrink neural nets by making regularizers do the architecture search

Google Research's MorphNet learns which channels to kill by treating FLOP and memory budgets as differentiable penalties.

1k stars · Python · ML Frameworks · Other AI
Velocity (7d): +0.4 ★ / day · Trend: steady

What it does

MorphNet takes a trained “seed” neural network and figures out how many output channels each convolution layer actually needs. You pick a resource to constrain (FLOPs, model size, or estimated latency on specific hardware) and add a regularization term targeting it to your training loss. The regularizer pushes unimportant filters toward zero; once a filter's magnitude falls below a threshold, its output channel is marked for removal. You export the proposed structure as JSON, rebuild a slimmer model, and retrain it from scratch.
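
The thresholding step is easy to picture in isolation. Below is a minimal sketch in plain Python, not the morph-net API: the layer names, the per-channel scale values, and the 0.1 threshold are all made up for illustration, and the JSON layout is an assumption rather than the format the library's exporter writes.

```python
import json


def propose_structure(channel_scales, alive_threshold=0.1):
    """Count, per layer, the channels whose scale magnitude exceeds the threshold."""
    return {
        layer: sum(1 for scale in scales if abs(scale) > alive_threshold)
        for layer, scales in channel_scales.items()
    }


# Hypothetical per-channel scale magnitudes after regularized training.
channel_scales = {
    "conv1": [0.92, 0.004, 0.31, 0.0],
    "conv2": [0.05, 0.77, 0.12, 0.6, 0.001, 0.2],
}

structure = propose_structure(channel_scales)

# Serialize the proposed channel counts (assumed layout: layer name -> count).
structure_json = json.dumps(structure, indent=2, sort_keys=True)
print(structure_json)
```

Here conv1 keeps 2 of its 4 channels and conv2 keeps 4 of its 6; a slimmer model would then be rebuilt with those widths and retrained.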

The interesting bit

The newer FiGS approach (the LogisticSigmoid regularizer) recasts channel selection as a probabilistic gating problem: you insert simple gating layers after activations, and the regularizer learns which channels to keep through stochastic gates. It works as either a pruning method or a full differentiable architecture search, which is unusual for a technique that otherwise looks like standard regularized training.
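
To make the gating idea concrete, here is a rough sketch of one stochastic gate per channel, assuming a relaxed Bernoulli (binary Concrete) sample: logistic noise is added to a learnable logit and passed through a sigmoid. This is a generic illustration; the library's own gating op may differ in detail, and the logits, activations, and temperature below are hypothetical.

```python
import math
import random


def sample_gate(logit, temperature=0.5, rng=random):
    """Draw one relaxed Bernoulli gate value in [0, 1] for a channel."""
    # Clamp the uniform sample so both logarithms stay finite.
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    logistic_noise = math.log(u) - math.log(1.0 - u)
    return 1.0 / (1.0 + math.exp(-(logit + logistic_noise) / temperature))


def keep_probability(logit):
    """Probability that the gate is more open than closed: sigmoid(logit)."""
    return 1.0 / (1.0 + math.exp(-logit))


rng = random.Random(0)
gate_logits = [4.0, 0.0, -4.0]         # hypothetical learned logits, one per channel
channel_activations = [1.5, 2.0, 0.7]  # hypothetical post-activation values

# Each channel's activation is multiplied by its sampled gate.
gates = [sample_gate(logit, rng=rng) for logit in gate_logits]
gated = [a * g for a, g in zip(channel_activations, gates)]

for logit, gate in zip(gate_logits, gates):
    print(f"logit={logit:+.1f} keep_prob={keep_probability(logit):.3f} gate={gate:.3f}")
```

A regularizer that penalizes the expected cost of open gates pushes the logits of expensive, unhelpful channels downward; channels whose keep probability ends up below a threshold are the ones to prune.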

Key highlights

  • Targets concrete resources: FLOPs, model size, or device-specific latency—not just abstract sparsity.
  • Three regularizer algorithms: LogisticSigmoid (recommended, works with any model), Gamma (for BatchNorm networks), and GroupLasso (deprecated, for models without BatchNorm).
  • Two-stage workflow: structure learning with the regularizer, then retraining the pruned model normally.
  • Hyperparameter guidance is unusually specific: search for a regularization strength near 1/(initial cost) and use a fixed learning rate during structure learning.
  • Includes TensorBoard monitoring for regularization loss and estimated cost.
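
The 1/(initial cost) rule of thumb translates into a small search grid. The sketch below assumes a seed network costing 1e9 FLOPs (an illustrative figure) and a log-spaced sweep one decade either side of 1/(initial cost); the helper name and the width of the sweep are choices made here, not something the library provides.

```python
import math


def strength_grid(initial_cost, decades=1, points_per_decade=2):
    """Log-spaced regularization strengths centered on 1 / initial_cost."""
    center = math.log10(1.0 / initial_cost)
    steps = decades * points_per_decade
    return [10 ** (center + i / points_per_decade) for i in range(-steps, steps + 1)]


initial_flops = 1e9  # illustrative seed-network cost
candidates = strength_grid(initial_flops)

for strength in candidates:
    print(f"{strength:.2e}")
```

Each candidate would be one structure-learning run at a fixed learning rate; the resulting cost and accuracy trade-offs are then compared across runs.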

Caveats

  • The README still uses TensorFlow 1.x patterns (tf.Session, tf.layers, tf.train.MomentumOptimizer) and notes “TODO Add Keras example,” so modern TF/PyTorch users are on their own for API translation.
  • MorphNet does not change network topology—only channel counts per layer—so it won’t help if your problem is the wrong layer stack.
  • The regularization loss is not automatically added to tf.GraphKeys.REGULARIZATION_LOSSES; you must wire it in manually.

Verdict

Worth a look if you’re already in TensorFlow and need to hit hard latency or memory constraints without hand-tuning layer sizes. Skip it if you need topology changes, PyTorch-native tooling, or a fully automated one-shot training pipeline.
