jiaweizzhao/GaLore

A PyTorch optimizer that reduces LLM training memory by projecting gradients into low-rank space.

1.7k stars · Python · ML Frameworks · Language Models
Velocity (7d): +2.1 ★ / day · Trend: steady

GaLore provides memory-efficient full-parameter learning for LLMs by projecting gradients into a low-rank subspace during training. It integrates with existing optimizers like AdamW, AdamW8bit, and Adafactor with minimal code changes. The method achieves comparable or better results than LoRA-style adapters while maintaining full-parameter learning capabilities, and has been extended with quantized variants like Q-GaLore.
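To make the idea of projecting gradients into a low-rank subspace concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the GaLore library's API or implementation: the function names `low_rank_projector` and `projected_adam_step`, the use of a truncated SVD to pick the subspace, and the Adam-style update are all chosen here for demonstration. The point it shows is that the optimizer moments are stored at shape `rank x n` rather than `m x n`, which is where the memory saving would come from.

```python
import numpy as np


def low_rank_projector(grad, rank):
    """Hypothetical helper: top-`rank` left singular vectors of `grad` (m x rank)."""
    u, _, _ = np.linalg.svd(grad, full_matrices=False)
    return u[:, :rank]


def projected_adam_step(weight, grad, proj, state, lr=1e-3, betas=(0.9, 0.999), eps=1e-8):
    """Hypothetical helper: one Adam-style step with moments kept in the low-rank space."""
    b1, b2 = betas
    low = proj.T @ grad  # project the full gradient down to (rank x n)

    # Moment estimates live in the small (rank x n) space, not (m x n).
    state["step"] += 1
    state["m"] = b1 * state["m"] + (1 - b1) * low
    state["v"] = b2 * state["v"] + (1 - b2) * low ** 2

    # Standard Adam bias correction.
    m_hat = state["m"] / (1 - b1 ** state["step"])
    v_hat = state["v"] / (1 - b2 ** state["step"])

    # Project the normalized update back to the full (m x n) weight shape.
    update = proj @ (m_hat / (np.sqrt(v_hat) + eps))
    return weight - lr * update


rng = np.random.default_rng(0)
m, n, rank = 64, 32, 4
weight = rng.standard_normal((m, n))
grad = rng.standard_normal((m, n))  # stand-in for a real backprop gradient

proj = low_rank_projector(grad, rank)
state = {"step": 0, "m": np.zeros((rank, n)), "v": np.zeros((rank, n))}
new_weight = projected_adam_step(weight, grad, proj, state)

full_state_floats = 2 * m * n  # two Adam moments at full size
low_rank_state_floats = 2 * rank * n + m * rank  # two moments plus the projector
print(full_state_floats, low_rank_state_floats)
```

All of the weights are still updated, which is the sense in which the training remains full-parameter; only the optimizer state is compressed. A practical implementation would presumably recompute the projector periodically rather than once, but that schedule is not specified here.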
