
princeton-nlp/MeZO

A memory-efficient zeroth-order optimizer that fine-tunes language models using only forward passes, reducing memory usage by up to 12x.

1.2k stars · Python · Language Models · ML Frameworks
Velocity · 7d: +1.0 ★ / day
Trend: steady

MeZO adapts classical zeroth-order SGD to operate in place for language model fine-tuning. This makes it possible to train a 30B-parameter model on a single 80GB GPU, whereas backpropagation with Adam fits only a 2.7B-parameter model on the same hardware. The method performs comparably to backpropagation-based fine-tuning across multiple tasks and supports both full-parameter tuning and parameter-efficient techniques such as LoRA and prefix tuning. Because it needs only loss values and not gradients, it can also optimize non-differentiable objectives such as accuracy or F1 score.
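To make the in-place idea concrete, here is a minimal sketch of one zeroth-order SGD step on a toy problem. It is not the repository's implementation: the names `perturb`, `mezo_step`, and `quadratic_loss` are hypothetical, the "model" is a plain Python list instead of network tensors, and the hyperparameters are arbitrary. The point it illustrates is that the random direction is never stored. It is regenerated from a seed each time it is needed, so the step uses two forward evaluations and no extra copy of the parameters.

```python
import random


def perturb(params, seed, scale):
    """Add scale * z to params in place, where z ~ N(0, 1) is regenerated from the seed."""
    rng = random.Random(seed)  # same seed -> same direction z on every call
    for i in range(len(params)):
        params[i] += scale * rng.gauss(0.0, 1.0)


def mezo_step(params, loss_fn, lr=0.01, eps=1e-3, seed=0):
    """One zeroth-order SGD step using two loss evaluations and no stored gradient."""
    perturb(params, seed, +eps)          # params = theta + eps * z
    loss_plus = loss_fn(params)
    perturb(params, seed, -2.0 * eps)    # params = theta - eps * z
    loss_minus = loss_fn(params)
    perturb(params, seed, +eps)          # params back to theta (up to rounding)

    # Scalar estimate of the directional derivative along z.
    projected_grad = (loss_plus - loss_minus) / (2.0 * eps)

    # SGD update theta -= lr * projected_grad * z, again regenerating z from the seed.
    perturb(params, seed, -lr * projected_grad)
    return projected_grad


def quadratic_loss(params):
    """Toy stand-in for a model's forward pass: minimized when every entry is 3.0."""
    return sum((p - 3.0) ** 2 for p in params)


theta = [0.0, 0.0, 0.0, 0.0]
initial_loss = quadratic_loss(theta)
for step in range(2000):
    mezo_step(theta, quadratic_loss, lr=0.01, eps=1e-3, seed=step)  # fresh direction per step
final_loss = quadratic_loss(theta)
print(initial_loss, final_loss)
```

Since `loss_fn` is only ever called and never differentiated, the same loop would accept any scalar objective, which is what allows metrics such as accuracy or F1 to be optimized directly.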
