mit-han-lab/efficientvit

Efficient vision foundation models for high-resolution image generation using diffusion architectures and vision transformers.

Velocity (7d): +2.9 ★/day · Trend: steady

This repository provides efficient vision foundation models including Deep Compression Autoencoders (DC-AE) for high-resolution diffusion models and vision transformer architectures. It supports tasks such as image generation on ImageNet at 512x512 resolution and segmentation. The project includes implementations of models like SANA (text-to-image) and USiT for state-of-the-art generation quality.
