
AGI-Arena/MARS

MARS is an optimizer framework that combines variance reduction with preconditioned updates to accelerate training of large language models.

721 stars · Python · ML Frameworks · Language Models
Velocity (7d): +1.3 ★ / day · Trend: steady

MARS (Make Variance Reduction Shine) is a unified optimization framework that targets gradient variance in large-model training. It combines scaled stochastic recursive momentum, which yields a variance-reduced gradient estimate, with a preconditioned update that approximates a second-order Newton step. The implementation includes CUDA kernels and supports both pretraining (GPT-2 XL, FineWeb-Edu) and fine-tuning workflows for large language models.
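The two ingredients described above can be sketched in a few lines of plain Python. This is an illustrative toy, not the repository's code: the function name `mars_step`, the hyperparameter defaults, the unit-norm clipping of the corrected gradient, and the choice of an Adam-style diagonal preconditioner are all assumptions made for the example. Weight decay is omitted for brevity.

```python
import math


def mars_step(x, m, v, grad, prev_grad, t,
              lr=0.05, beta1=0.9, beta2=0.99, gamma=0.025, eps=1e-8):
    """One illustrative variance-reduced, preconditioned step on a list of floats.

    grad is the stochastic gradient at the current point; prev_grad is the
    gradient at the previous point (ideally evaluated on the same minibatch).
    t is the 1-based step counter used for bias correction.
    """
    # Scaled gradient-difference correction (the recursive momentum part).
    scale = gamma * beta1 / (1.0 - beta1)
    c = [g + scale * (g - pg) for g, pg in zip(grad, prev_grad)]

    # Clip the corrected gradient to unit norm (assumed stabilizer).
    norm = math.sqrt(sum(ci * ci for ci in c))
    if norm > 1.0:
        c = [ci / norm for ci in c]

    # Exponential moving averages of the corrected gradient and its square.
    m = [beta1 * mi + (1.0 - beta1) * ci for mi, ci in zip(m, c)]
    v = [beta2 * vi + (1.0 - beta2) * ci * ci for vi, ci in zip(v, c)]

    # Bias correction, then a diagonally preconditioned update.
    m_hat = [mi / (1.0 - beta1 ** t) for mi in m]
    v_hat = [vi / (1.0 - beta2 ** t) for vi in v]
    x = [xi - lr * mh / (math.sqrt(vh) + eps)
         for xi, mh, vh in zip(x, m_hat, v_hat)]
    return x, m, v


def run_demo(steps=200):
    """Minimize 0.5 * ||x||^2, whose gradient at x is x itself."""
    x = [1.0, -2.0]
    m = [0.0, 0.0]
    v = [0.0, 0.0]
    prev_x = list(x)  # first step: the correction term is zero
    for t in range(1, steps + 1):
        grad = list(x)
        prev_grad = list(prev_x)
        new_x, m, v = mars_step(x, m, v, grad, prev_grad, t)
        prev_x, x = x, new_x
    return x
```

Calling `run_demo()` drives the iterate toward the origin on this noise-free quadratic. In real training the two gradients would come from the same minibatch evaluated at consecutive parameter values, which costs an extra backward pass unless the previous step's gradient is reused as an approximation.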
