
KellerJordan/Muon

Muon is a custom optimizer for training the hidden weights of neural networks, designed to work alongside AdamW.

2.7k stars · Python · ML Frameworks
Velocity (7d): +4.6 ★ / day · Trend: steady

This repository implements the Muon optimizer, originally described in public posts. Muon is designed to optimize only the hidden weights of a neural network; embeddings, classifier heads, and all other parameters are handled by AdamW. The implementation provides a MuonWithAuxAdam class that runs both optimizers in a single training run by assigning each parameter to a separate parameter group.
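The split between the two optimizers can be illustrated with a minimal, dependency-free sketch. It only partitions parameter names by role and shape; it does not call the repository's code. The helper `split_param_groups`, the `embed`/`head` name prefixes, and the rule that "hidden weights" means tensors with two or more dimensions are all assumptions made for illustration. The `use_muon` key is likewise an assumption about how groups might be flagged for MuonWithAuxAdam, so check the repository's README for the actual constructor arguments.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def split_param_groups(
    named_shapes: Dict[str, Shape],
    non_hidden_prefixes: Tuple[str, ...] = ("embed", "head"),  # hypothetical naming
) -> List[dict]:
    """Partition parameter names into a Muon group and an AdamW group.

    Assumption: a parameter counts as a hidden weight when it lives outside
    the embedding / classifier head and has at least two dimensions.
    Gains, biases, embeddings, and head weights fall through to AdamW.
    """
    muon_names: List[str] = []
    adamw_names: List[str] = []
    for name, shape in named_shapes.items():
        is_hidden = not name.startswith(non_hidden_prefixes)
        if is_hidden and len(shape) >= 2:
            muon_names.append(name)
        else:
            adamw_names.append(name)
    # "use_muon" is an assumed flag name, not confirmed by this page.
    return [
        {"params": muon_names, "use_muon": True},
        {"params": adamw_names, "use_muon": False},
    ]


# Hypothetical parameter names and shapes for a small transformer-like model.
example = {
    "embed.weight": (50257, 768),
    "body.0.attn.weight": (768, 768),
    "body.0.norm.gain": (768,),
    "body.0.mlp.bias": (3072,),
    "head.weight": (50257, 768),
}
groups = split_param_groups(example)
```

In a real training script the lists would hold the parameter tensors themselves rather than their names, and each group would also carry its own hyperparameters such as learning rate and weight decay.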
