
FoundationVision/VAR

Visual Autoregressive (VAR) model generates images using GPT-style next-scale prediction, achieving state-of-the-art results and demonstrating scaling laws in visual generation.

8.7k stars · Jupyter Notebook · Image · Video · Audio · Language Models
Velocity (7d): +11 ★ / day · Trend: steady

VAR reframes image generation as next-scale prediction rather than next-token prediction: the model generates an image coarse-to-fine, predicting each higher-resolution token map from the lower-resolution maps before it. This lets GPT-style transformers compete with and surpass diffusion models. The official implementation provides training and inference code for autoregressive image generation across multiple model sizes. The VAR paper won the NeurIPS 2024 Best Paper Award for demonstrating scaling laws in visual generation comparable to those of language models.
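To make the difference between next-scale and next-token prediction concrete, here is a minimal sketch that only counts autoregressive steps; it does not use the repository's code. The function name `next_scale_schedule` and the scale schedule `SIDES` are hypothetical, chosen for illustration, and are not taken from the official implementation.

```python
from typing import Dict, List, Sequence


def next_scale_schedule(side_lengths: Sequence[int]) -> List[Dict[str, int]]:
    """Describe each step of a coarse-to-fine schedule.

    At every step a whole token map of size side x side is predicted at once,
    conditioned on all tokens from the coarser maps generated before it.
    """
    steps = []
    context = 0  # tokens generated at earlier (coarser) scales
    for k, side in enumerate(side_lengths, start=1):
        tokens = side * side
        steps.append(
            {
                "step": k,
                "side": side,
                "tokens_predicted": tokens,
                "context_tokens": context,
            }
        )
        context += tokens
    return steps


# Hypothetical schedule of token-map side lengths, ending at a 16x16 map.
SIDES = (1, 2, 4, 8, 16)

schedule = next_scale_schedule(SIDES)
next_scale_steps = len(schedule)   # one autoregressive step per scale
next_token_steps = SIDES[-1] ** 2  # one step per token of the final 16x16 map

for row in schedule:
    print(row)
print("next-scale steps:", next_scale_steps)
print("next-token steps:", next_token_steps)
```

Under this assumed schedule, next-scale prediction needs one step per scale, while raster-order next-token prediction over the final map needs one step per token.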
