Tencent/PatrickStar
A distributed training framework for large language models using chunk-based heterogeneous memory management.

Velocity (7d): +0.4 ★ / day · Trend: steady
PatrickStar enables training of very large NLP models by dynamically managing CPU and GPU memory. Its chunk-based memory manager moves model data between CPU and GPU memory as needed, so a single node can train models far larger than GPU memory alone would permit. The system also scales to multiple GPUs with efficient collective communication, which positions it as an alternative to DeepSpeed ZeRO Stage 3.
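
To make the chunk idea concrete, here is a minimal toy sketch in plain Python. It is not PatrickStar's API: the class `ToyChunkManager`, its methods, and the least-recently-used eviction policy are all assumptions made for illustration, and "moving" a chunk is only simulated by updating a location label. It shows tensors being packed into fixed-size chunks, with a chunk brought to the GPU on access and another pushed back to the CPU when the GPU budget is full.

```python
from collections import OrderedDict


class ToyChunkManager:
    """Toy model of chunk-based CPU/GPU placement (hypothetical, not PatrickStar's API)."""

    def __init__(self, chunk_size, gpu_capacity_chunks):
        if gpu_capacity_chunks < 1:
            raise ValueError("GPU must hold at least one chunk")
        self.chunk_size = chunk_size          # elements per chunk
        self.gpu_capacity = gpu_capacity_chunks
        self.chunk_used = []                  # elements used in each chunk
        self.tensor_to_chunk = {}             # tensor name -> chunk id
        self.location = {}                    # chunk id -> "cpu" or "gpu"
        self._gpu_lru = OrderedDict()         # chunk ids on GPU, least recently used first

    def register(self, name, numel):
        """Pack a tensor into the last chunk if it fits, else open a new chunk."""
        if numel > self.chunk_size:
            raise ValueError("tensor larger than one chunk")
        if not self.chunk_used or self.chunk_used[-1] + numel > self.chunk_size:
            self.chunk_used.append(0)
            self.location[len(self.chunk_used) - 1] = "cpu"  # new chunks start on CPU
        cid = len(self.chunk_used) - 1
        self.chunk_used[cid] += numel
        self.tensor_to_chunk[name] = cid
        return cid

    def access(self, name):
        """Ensure the tensor's chunk is on the GPU, evicting the LRU chunk if needed."""
        cid = self.tensor_to_chunk[name]
        if cid in self._gpu_lru:
            self._gpu_lru.move_to_end(cid)    # mark as most recently used
            return cid
        if len(self._gpu_lru) >= self.gpu_capacity:
            evicted, _ = self._gpu_lru.popitem(last=False)
            self.location[evicted] = "cpu"    # simulated offload to CPU
        self._gpu_lru[cid] = True
        self.location[cid] = "gpu"            # simulated upload to GPU
        return cid


# Usage: three tensors, 100-element chunks, room for one chunk on the GPU.
mgr = ToyChunkManager(chunk_size=100, gpu_capacity_chunks=1)
mgr.register("layer0.weight", 60)   # goes into chunk 0
mgr.register("layer0.bias", 30)     # still fits in chunk 0
mgr.register("layer1.weight", 60)   # does not fit, opens chunk 1
mgr.access("layer0.weight")         # chunk 0 moves to GPU
mgr.access("layer1.weight")         # chunk 1 moves to GPU, chunk 0 is evicted
```

Managing memory in chunks rather than per tensor means many small tensors move together in one transfer, which is the general motivation for chunking. A real system would copy actual buffers and choose evictions from runtime information rather than this simple policy.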