
microsoft/mup

A PyTorch package implementing maximal update parametrization (μP) for stable hyperparameter scaling across neural network widths.

1.7k stars · Jupyter Notebook · ML Frameworks · Learning
mup
Velocity (7d): +1.0 ★ / day · Trend: steady

The mup package provides tools for implementing μP in PyTorch models. Under μP, the optimal values of hyperparameters such as the learning rate stay approximately stable as model width changes, so hyperparameters tuned on a small model can be transferred to a much larger one, reducing uncertainty when scaling up neural networks. The underlying research focuses on large pretrained transformers, but the technique applies to deep learning model scaling in general.
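As a rough illustration of what hyperparameter transfer across widths means, the sketch below rescales an Adam-style learning rate for hidden (matrix-like) weights by the inverse of the width multiplier, which is the kind of rule μP prescribes. It is plain Python and does not use the mup package; the function name `scaled_hidden_lr` and the widths and learning rate are hypothetical values chosen for the example.

```python
def scaled_hidden_lr(base_lr: float, base_width: int, width: int) -> float:
    """Illustrative muP-style rule (not the mup API): divide the learning
    rate tuned at `base_width` by the width multiplier of the target model."""
    if base_width <= 0 or width <= 0:
        raise ValueError("widths must be positive")
    width_mult = width / base_width  # how much wider the target model is
    return base_lr / width_mult


# Hypothetical setup: learning rate tuned once on a small proxy of width 256.
tuned_lr = 1e-3
for width in (256, 1024, 4096):
    # The same tuned value is reused; only the per-width scaling changes.
    print(width, scaled_hidden_lr(tuned_lr, base_width=256, width=width))
```

In the package itself this bookkeeping is handled for you rather than written by hand; the sketch only shows why a value tuned on a narrow model can be reused on a wide one.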
