
zyushun/Adam-mini

A memory-efficient Adam optimizer variant for training deep learning and large language models.

458 stars · Python · ML Frameworks
Velocity (7d): +0.6 ★/day · Trend: steady

Adam-mini is a PyTorch optimizer that cuts Adam’s memory footprint by about 50% while matching or exceeding its performance. It partitions parameters into blocks based on Hessian structure and assigns a single learning rate to each block, which removes over 99.9% of the per-parameter learning-rate entries stored in Adam’s second-moment state v. The implementation is a drop-in replacement that reuses AdamW hyperparameters, and it supports transformer-specific configuration, including the parameters of multi-head attention.
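To make the block-wise idea concrete, here is a minimal sketch in plain Python, not the repository's implementation. It assumes the blocks are already given (each inner list is one block; the Hessian-based partitioning is not shown), omits weight decay, and uses the hypothetical names `blockwise_adam_step` and `state_sizes`. Each block keeps one shared second-moment scalar, updated from the mean squared gradient of that block, instead of one entry per parameter.

```python
import math


def blockwise_adam_step(blocks, grads, m, v, step,
                        lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one simplified block-wise Adam update in place.

    blocks, grads, m: list of lists of floats (one inner list per block).
    v: list of floats, ONE second-moment scalar per block (assumption:
       this mirrors the "single learning rate per block" idea).
    step: 1-based step counter used for bias correction.
    """
    for b, (params, g) in enumerate(zip(blocks, grads)):
        # Shared second moment: EMA of the block's mean squared gradient.
        mean_sq = sum(x * x for x in g) / len(g)
        v[b] = beta2 * v[b] + (1 - beta2) * mean_sq
        v_hat = v[b] / (1 - beta2 ** step)
        denom = math.sqrt(v_hat) + eps

        for i in range(len(params)):
            # First moment stays per-parameter, as in standard Adam.
            m[b][i] = beta1 * m[b][i] + (1 - beta1) * g[i]
            m_hat = m[b][i] / (1 - beta1 ** step)
            params[i] -= lr * m_hat / denom
    return blocks


def state_sizes(blocks):
    """Return (adam_entries, blockwise_entries) of optimizer state."""
    n = sum(len(p) for p in blocks)
    adam_entries = 2 * n                 # per-parameter m and v
    blockwise_entries = n + len(blocks)  # per-parameter m, per-block v
    return adam_entries, blockwise_entries
```

Because standard Adam stores m and v at the same size as the parameters, dropping almost all of v is what brings the optimizer state down to roughly half when blocks are large; `state_sizes` shows that accounting for a given partition.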
