← all repositories

CompVis/taming-transformers

Generative model for high-resolution image synthesis using VQGAN with autoregressive transformer composition.

6.5k stars Jupyter Notebook Image · Video · AudioML Frameworks
taming-transformers
Velocity · 7d
+3.3
★ / day
Trend
steady
star history

Taming Transformers introduces a convolutional VQGAN approach that learns a codebook of context-rich visual parts, whose composition is modeled with an autoregressive transformer. The method combines the efficiency of convolutional approaches with the expressivity of transformers for high-resolution image synthesis. The repository provides training code and pretrained models for class-conditional ImageNet synthesis, achieving state-of-the-art FID scores among autoregressive approaches.

heatdrop uses Google Analytics to see which pages get read — nothing else. Your call. How we handle data.