mit-han-lab/distrifuser
A training-free algorithm that parallelizes diffusion model inference across multiple GPUs to accelerate high-resolution image generation.

Velocity (7d): +0.9 ★/day · Trend: steady
DistriFusion reduces the latency of generating high-resolution images with diffusion models by splitting the computation across multiple GPUs. It divides the image into patches, processes each patch on a separate GPU, and maintains communication between patches so that image quality is preserved. The method requires no training and has been integrated into frameworks such as NVIDIA TensorRT-LLM and ColossalAI.
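To make the patch-based idea concrete, here is a toy sketch in plain Python. It is not the repository's API: every name in it (`blur_rows`, `split_rows`, `patch_parallel_step`) is hypothetical, the "devices" are simulated in a single loop, and a 3x3 mean filter stands in for a network layer that needs neighbouring pixels. It also assumes one possible way to keep patches in contact: each device reads the regions it does not own from the previous step's values instead of waiting for fresh ones.

```python
from typing import List

Grid = List[List[float]]


def blur_rows(context: Grid, start: int, stop: int) -> Grid:
    """3x3 mean filter for rows [start, stop) of `context`, clamped at the borders."""
    h, w = len(context), len(context[0])
    out = []
    for r in range(start, stop):
        row = []
        for c in range(w):
            vals = [
                context[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, h))
                for cc in range(max(c - 1, 0), min(c + 2, w))
            ]
            row.append(sum(vals) / len(vals))
        out.append(row)
    return out


def split_rows(height: int, num_devices: int) -> List[range]:
    """Assign contiguous row ranges to devices as evenly as possible."""
    base, extra = divmod(height, num_devices)
    bounds, start = [], 0
    for i in range(num_devices):
        stop = start + base + (1 if i < extra else 0)
        bounds.append(range(start, stop))
        start = stop
    return bounds


def patch_parallel_step(current: Grid, stale: Grid, num_devices: int) -> Grid:
    """Each simulated device updates only its own rows.

    Rows owned by other devices are read from `stale` (assumed to be the
    previous step's values), so no device waits on fresh data from another.
    """
    out = []
    for rows in split_rows(len(current), num_devices):
        # Fresh values for the device's own rows, stale values everywhere else.
        context = [current[r] if r in rows else stale[r] for r in range(len(current))]
        out.extend(blur_rows(context, rows.start, rows.stop))
    return out


image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
full = blur_rows(image, 0, len(image))
patched = patch_parallel_step(image, image, num_devices=2)
```

When the stale copy happens to equal the current image, `patched` matches `full` exactly, which shows that the patch split alone does not change the result. Any difference comes only from how out of date the borrowed context is.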