dome272/Wuerstchen
A text-to-image diffusion framework that performs text-conditional generation in a latent space compressed by 42x.

Velocity (7d): +0.5 ★/day · Trend: steady

Würstchen is a diffusion-based text-to-image framework built on multi-stage latent space compression. Stages A and B compress images, while Stage C handles text conditioning in the resulting low-dimensional space, which makes training fast and computationally cheap. The model achieves 42x compression while maintaining faithful image reconstruction, and it is fully integrated into the Hugging Face diffusers library.
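
To give a rough sense of what a 42x compression factor means for Stage C, here is a minimal arithmetic sketch. It assumes the 42x figure applies to each spatial side of the image, which the description above does not state explicitly, and it uses floor division as a stand-in for whatever rounding the model actually applies. The helper names `latent_side` and `position_reduction` are hypothetical and are not part of the Würstchen or diffusers APIs.

```python
# Assumption: the 42x factor from the description is a per-side spatial factor.
COMPRESSION_FACTOR = 42


def latent_side(image_side: int, factor: int = COMPRESSION_FACTOR) -> int:
    """Approximate side length of the compressed latent grid.

    Floor division is an illustrative choice; the model's exact rounding
    is not given in the description.
    """
    return image_side // factor


def position_reduction(image_side: int, factor: int = COMPRESSION_FACTOR) -> float:
    """Ratio of pixel positions to latent positions for a square image."""
    side = latent_side(image_side, factor)
    return (image_side * image_side) / (side * side)


if __name__ == "__main__":
    for image_side in (512, 1024):
        side = latent_side(image_side)
        ratio = position_reduction(image_side)
        print(f"{image_side}x{image_side} image -> about {side}x{side} latent "
              f"({ratio:.0f}x fewer spatial positions)")
```

Under these assumptions, the text-conditional stage operates on a grid with far fewer spatial positions than the image has pixels, which is consistent with the description's claim that training in this space is computationally cheap.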