NVlabs/Sana
SANA is a linear diffusion transformer for efficient high-resolution text-to-image and text-to-video generation.

Velocity (7d): +14 ★/day · Trend: steady
SANA is a generative AI model that produces high-resolution images and videos from text prompts using a linear diffusion transformer architecture. Its focus is efficiency: it applies techniques such as linear transformers to make diffusion models cheaper to run. It supports several inference backends, including ComfyUI, SGLang, and Cosmos-RL. The project ships multiple variants, including Sana-1.5, Sana-Sprint, and Sana-Video.
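To illustrate why a "linear" transformer is cheaper than a standard one, here is a minimal NumPy sketch of non-causal linear attention. It is not taken from the SANA codebase: the ReLU feature map, the function names `relu_linear_attention` and `quadratic_reference`, and the `eps` stabilizer are all assumptions made for illustration. The point is that the keys and values can be summarized once in a small `(d, d_v)` matrix, so the `(n, n)` score matrix is never built.

```python
import numpy as np


def relu_linear_attention(q, k, v, eps=1e-6):
    """Non-causal linear attention with a ReLU feature map (illustrative).

    q, k: arrays of shape (n, d); v: array of shape (n, d_v).
    Cost scales as O(n * d * d_v) instead of O(n^2 * d).
    """
    q = np.maximum(q, 0.0)      # assumed feature map: ReLU
    k = np.maximum(k, 0.0)
    kv = k.T @ v                # (d, d_v) summary of all keys and values
    z = k.sum(axis=0)           # (d,) sum of keys, used for normalization
    num = q @ kv                # (n, d_v)
    den = q @ z + eps           # (n,)
    return num / den[:, None]


def quadratic_reference(q, k, v, eps=1e-6):
    """Same result computed the O(n^2) way, for comparison only."""
    q = np.maximum(q, 0.0)
    k = np.maximum(k, 0.0)
    scores = q @ k.T            # (n, n) matrix that linear attention avoids
    return (scores @ v) / (scores.sum(axis=1, keepdims=True) + eps)
```

Both functions return the same values up to floating-point error because matrix multiplication is associative: `(q @ k.T) @ v` equals `q @ (k.T @ v)`. Only the second ordering keeps the cost linear in the number of tokens `n`, which matters at high resolution, where `n` is large.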