
Vchitect/Latte

Latte is a latent diffusion transformer for generating videos from text or image conditions, published in TMLR 2025.

1.9k stars | Python | Image · Video · Audio
Velocity (7d): +2.0 ★/day · Trend: steady

Latte implements a latent diffusion transformer architecture for video synthesis. The repository provides PyTorch model definitions, pre-trained checkpoints hosted on Hugging Face, and complete training and sampling pipelines. It supports text-to-video and image-to-video generation and is the official implementation of the corresponding research paper.
