Is ml-mdm open source?

Yes — apple/ml-mdm is open source, released under the MIT license.

What language is ml-mdm written in?

apple/ml-mdm is primarily written in Python.

How popular is ml-mdm?

apple/ml-mdm has 515 stars on GitHub.

Where can I find ml-mdm?

apple/ml-mdm is on GitHub at https://github.com/apple/ml-mdm.

apple/ml-mdm

Nesting resolutions lets this diffusion model train on just 12M images

Apple’s reference implementation trains a single text-to-image diffusion model at nested resolutions up to 1024×1024 without demanding massive datasets.

★ 515 stars | Python | Image · Video · Audio | ML Frameworks

What it does

ml_mdm is Apple’s reference implementation of Matryoshka Diffusion Models, a framework for training text-to-image and video diffusion models at multiple resolutions in one go. It bundles the training pipeline, pretrained weights, and a local web demo into a single Python package. Instead of chaining separate upsamplers, you train one pixel-space model that handles everything from 64×64 to 1024×1024.

The interesting bit

The “matryoshka” idea is nesting lower-resolution diffusion processes inside higher-resolution ones, so the model denoises several scales jointly and reuses structure across them. The authors report strong zero-shot generalization when training on CC12M, a dataset of only 12 million images, which is unusually small for high-resolution diffusion work.
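
To make the nesting idea concrete, here is a toy sketch in plain Python. It is not code from the repo: the helper names (`downsample`, `nested_views`, `add_noise`), the 8×8 toy image, and the fixed noise scale are all assumptions chosen for illustration. It only shows how one image can be turned into noisy copies at several nested resolutions, which is roughly the kind of input a nested model would denoise together.

```python
import random


def downsample(image, factor):
    """Average-pool a square grayscale image (a list of rows) by `factor`."""
    size = len(image) // factor
    return [
        [
            sum(
                image[y * factor + dy][x * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ) / factor**2
            for x in range(size)
        ]
        for y in range(size)
    ]


def nested_views(image, factors=(1, 2, 4)):
    """Return the same image at each nested resolution, largest first."""
    return [downsample(image, f) for f in factors]


def add_noise(image, noise_scale, rng):
    """Add Gaussian noise to every pixel (hypothetical fixed noise scale)."""
    return [[px + noise_scale * rng.gauss(0.0, 1.0) for px in row] for row in image]


rng = random.Random(0)
# Toy 8x8 checkerboard standing in for a real training image.
image = [[float((x + y) % 2) for x in range(8)] for y in range(8)]

views = nested_views(image)
noisy_views = [add_noise(v, 0.5, rng) for v in views]

print([len(v) for v in noisy_views])  # [8, 4, 2]
```

In the real model the resolutions would be sizes like 64, 256, and 1024, and the noise level would follow a diffusion schedule rather than a constant.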

Key highlights

Pretrained checkpoints are available for 64×64, 256×256, and 1024×1024 generation, trained on 50 million Flickr image-text pairs.
The repo includes configs and scraping scripts to reproduce the CC12M data-efficient training claims from scratch.
SimpleParsing wires dataclass configs directly to CLI arguments and YAML files, keeping the configuration logic relatively tidy.
A model and pipeline registry lets you swap between U-Net and Nested U-Net architectures without rewriting the training loop.
The default install targets CPU-only machines for development; GPU tests are opt-in via pytest markers.
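
The config and registry highlights above follow a common pattern, sketched below. The repo uses SimpleParsing for the config side; this sketch swaps in the standard library's `argparse` and `dataclasses` so it runs without extra dependencies. Every name here (`TrainConfig`, `MODEL_REGISTRY`, `register_model`, `build_model`, and the keys `"unet"` and `"nested_unet"`) is hypothetical and not taken from the repo.

```python
import argparse
from dataclasses import dataclass, fields

# Hypothetical registry mapping a string key to a model class.
MODEL_REGISTRY = {}


def register_model(name):
    """Class decorator that records a model class under `name`."""
    def decorator(cls):
        MODEL_REGISTRY[name] = cls
        return cls
    return decorator


@dataclass
class TrainConfig:
    """Illustrative config; the field names are assumptions."""
    model: str = "unet"
    resolution: int = 64
    batch_size: int = 8


@register_model("unet")
class UNet:
    def __init__(self, config):
        self.config = config


@register_model("nested_unet")
class NestedUNet(UNet):
    pass


def parse_config(argv=None):
    """Expose each dataclass field as a --flag with the field's default."""
    parser = argparse.ArgumentParser()
    for f in fields(TrainConfig):
        parser.add_argument(f"--{f.name}", type=type(f.default), default=f.default)
    return TrainConfig(**vars(parser.parse_args(argv)))


def build_model(config):
    """Look up the architecture by name; the caller never names a class."""
    return MODEL_REGISTRY[config.model](config)


config = parse_config(["--model", "nested_unet", "--resolution", "256"])
model = build_model(config)
print(type(model).__name__, config.resolution)  # NestedUNet 256
```

Because the training code only ever calls `build_model(config)`, switching architectures is a matter of changing one flag or config value.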

Verdict

Researchers who need to train or fine-tune high-resolution diffusion models without industrial-scale datasets should find this a useful starting point. If you are looking for a drop-in replacement for Stable Diffusion, look elsewhere: this is research code with its own architecture, not a compatible plug-in.

Frequently asked

What is apple/ml-mdm?: Apple’s reference implementation trains a single text-to-image diffusion model at nested resolutions up to 1024×1024 without demanding massive datasets.