← all repositories

luyug/GradCache

A memory-efficient training technique for scaling contrastive learning batch size far beyond GPU/TPU constraints using gradient caching.

442 stars Python ML FrameworksRAG · Search
GradCache
Velocity · 7d
+0.2
★ / day
Trend
steady
star history

Gradient Cache enables training contrastive learning models with arbitrarily large batch sizes on limited hardware by caching gradients and only keeping one model copy in memory at a time. It supports both PyTorch and JAX/Flax frameworks, making it adaptable across deep learning ecosystems. The technique was developed for dense passage retrieval (DPR) systems and embedding training, which are core components of RAG pipelines and LLM application stacks.

heatdrop uses Google Analytics to see which pages get read — nothing else. Your call. How we handle data.