← all repositories

deepseek-ai/DualPipe

A bidirectional pipeline parallelism algorithm that overlaps forward and backward computation with communication during LLM training.

3k stars Python ML FrameworksLanguage Models
DualPipe
Velocity · 7d
+6.4
★ / day
Trend
steady
star history

DualPipe implements an innovative scheduling strategy for distributed training of large language models, enabling full overlap of computation and communication phases while reducing pipeline bubbles. It provides both a full bidirectional schedule and a more memory-efficient V-shape variant (DualPipeV). The algorithm is designed specifically for training DeepSeek V3 and R1 models across multiple devices.

heatdrop uses Google Analytics to see which pages get read — nothing else. Your call. How we handle data.