dome272/Wuerstchen

This diffusion model squeezes images 42× before generating them

To make text-to-image training cheaper, Würstchen squeezes images through two compression stages before running diffusion.

★ 555 stars | Jupyter Notebook | Image · Video · Audio | ML Frameworks

What it does

Würstchen is a three-stage diffusion architecture. Stages A and B jointly compress images by a factor of 42 along each spatial dimension, so a 512×512 image becomes a latent grid of roughly 12×12, and Stage C performs the expensive text-conditional generation in that tiny space. The repository contains the official implementation of the ICLR 2024 oral paper, including training scripts for Stages B and C and notebooks for reconstruction and text-to-image generation.
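
The compression figures can be checked with a little arithmetic. The sketch below assumes the 42× factor applies to each side of a square image and that the latent side is rounded to the nearest integer; the helper names `latent_side` and `position_ratio` and the rounding rule are illustrative assumptions, not taken from the repository.

```python
# Illustrative arithmetic only: both helpers are hypothetical, and
# rounding to the nearest integer is an assumption.
SPATIAL_FACTOR = 42  # per-side compression factor quoted in the text


def latent_side(image_side: int, factor: int = SPATIAL_FACTOR) -> int:
    """Approximate side length of the Stage C latent grid for a square image."""
    return round(image_side / factor)


def position_ratio(image_side: int, factor: int = SPATIAL_FACTOR) -> float:
    """Number of pixels represented by one latent position."""
    side = latent_side(image_side, factor)
    return (image_side * image_side) / (side * side)


for size in (512, 1024):
    side = latent_side(size)
    print(f"{size}x{size} image -> {side}x{side} latent, "
          f"about {position_ratio(size):.0f} pixels per latent position")
```

Because the factor applies per side, each latent position stands in for well over a thousand pixels, which is why running the text-conditional stage in this space is comparatively cheap.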

The interesting bit

The project treats compression as the primary optimization target rather than a mere preprocessing step. By leaving the reconstruction of fine detail to Stages A and B, it keeps the text-conditioned Stage C lean and cheap to train. It is also available through the standard pipeline API of the Hugging Face diffusers library.

Key highlights

Three-stage design: Stages A and B handle 42× spatial compression; Stage C runs text-conditional diffusion in the resulting latent space (roughly 12×12 for a 512×512 image).
Official implementation of an ICLR 2024 oral paper.
Training scripts provided for Stage B and Stage C, plus Jupyter notebooks for reconstruction and text-to-image generation.
Two released checkpoints: v1 (512×512, 800k steps) and v2 (1024×1024, 918k steps), with Stage C at 1B parameters.
Fully integrated into the Hugging Face diffusers library.
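
For casual use through diffusers, a minimal sketch might look like the following. It assumes the `diffusers` and `torch` packages are installed and that the weights are published on the Hugging Face Hub under the model ID `warp-ai/wuerstchen`. That ID, the `generate` helper, and the chosen parameter values are not stated on this page, so treat them as assumptions to verify against the diffusers documentation.

```python
# Hypothetical helper: the model ID and parameter values are assumptions,
# not taken from this page.
MODEL_ID = "warp-ai/wuerstchen"


def generate(prompt: str, device: str = "cuda"):
    """Return a list of PIL images generated for `prompt`."""
    # Heavy dependencies are imported lazily so this module can be
    # imported without torch or diffusers installed.
    import torch
    from diffusers import AutoPipelineForText2Image

    # Half precision is only sensible on a GPU.
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = AutoPipelineForText2Image.from_pretrained(MODEL_ID, torch_dtype=dtype)
    pipe = pipe.to(device)

    # The combined pipeline runs the text-conditional prior (Stage C) first,
    # then decodes the compressed latents back to pixels.
    result = pipe(prompt, height=1024, width=1024, prior_guidance_scale=4.0)
    return result.images
```

Calling `generate("an astronaut riding a horse")[0].save("out.png")` would download the weights on first use and realistically needs a GPU.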

Caveats

Training scripts cover Stage B and Stage C, but Stage A is not included, so you will need the provided pretrained weights (or your own encoder) to get started.
The repository is structured as a research release—notebooks and standalone scripts rather than a packaged library.

Verdict

Worth exploring if you want to train custom text-to-image models with less compute, or if you are curious about multi-stage latent compression. Otherwise, the diffusers integration already covers casual usage.

Frequently asked

What is dome272/Wuerstchen? A text-to-image diffusion model that squeezes images through two compression stages before running diffusion, to make training cheaper.
Is Wuerstchen open source? Yes. dome272/Wuerstchen is open source, released under the MIT license.
What language is Wuerstchen written in? GitHub classifies dome272/Wuerstchen as primarily Jupyter Notebook; the code itself is Python.
How popular is Wuerstchen? dome272/Wuerstchen has 555 stars on GitHub.
Where can I find Wuerstchen? dome272/Wuerstchen is on GitHub at https://github.com/dome272/Wuerstchen.