
lucidrains/transfusion-pytorch

A PyTorch library implementing Meta AI's Transfusion, a multimodal model that jointly performs next-token language prediction and image generation using flow matching.

Velocity (7d): +2.1 ★/day · Trend: steady

This library provides a PyTorch implementation of the Transfusion architecture, which unifies autoregressive language modeling and continuous diffusion-style generation (flow matching, in this implementation) in a single transformer. It handles mixed sequences of text tokens and continuous modality representations, such as images encoded as latents, which enables training on interleaved text-image data. The implementation supports classifier-free guidance for improved generation quality and can be extended to an arbitrary number of modalities.
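
To make the description above concrete, here is a minimal sketch of the arithmetic behind a joint text-plus-latent objective and classifier-free guidance. It is not the library's API or code: every function name is hypothetical, it is written in dependency-free Python rather than PyTorch so it runs anywhere, and it assumes a linear flow-matching path with noise at t=0 and data at t=1, plus a simple weighted sum of the two losses.

```python
import math


def cross_entropy(logits, target):
    """Negative log-likelihood of index `target` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[target]


def flow_matching_pair(data, noise, t):
    """Point on the straight path from noise (t=0) to data (t=1),
    together with the constant velocity the model would regress onto.
    The direction of the path is an assumption of this sketch."""
    x_t = [(1.0 - t) * n + t * d for d, n in zip(data, noise)]
    velocity = [d - n for d, n in zip(data, noise)]
    return x_t, velocity


def mse(pred, target):
    """Mean squared error between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(pred)


def joint_loss(text_logits, text_targets, pred_velocities, true_velocities,
               flow_weight=1.0):
    """Mean cross-entropy over text positions plus a weighted mean
    flow-matching error over latent positions (weighting is assumed)."""
    text = sum(cross_entropy(l, y) for l, y in zip(text_logits, text_targets))
    text /= len(text_targets)
    flow = sum(mse(p, v) for p, v in zip(pred_velocities, true_velocities))
    flow /= len(true_velocities)
    return text + flow_weight * flow


def classifier_free_guidance(cond, uncond, scale):
    """Extrapolate from the unconditional prediction toward the
    conditional one; scale=1 returns the conditional prediction."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


if __name__ == "__main__":
    # One text position (2-token vocabulary) and one 2-dim latent position.
    x_t, v = flow_matching_pair(data=[1.0, 2.0], noise=[0.0, 0.0], t=0.5)
    loss = joint_loss([[0.0, 0.0]], [0], [[0.5, 1.5]], [v])
    print(x_t, v, loss)
    print(classifier_free_guidance(cond=[2.0], uncond=[1.0], scale=3.0))
```

In a real model the logits and predicted velocities would come from the same transformer, read off at the text and latent positions of one interleaved sequence; the sketch only shows how the two training signals and the guidance step combine.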
