FoundationVision/LlamaGen
LlamaGen applies autoregressive next-token prediction from LLMs to scalable image generation.

Velocity · 7d
+2.7
★ / day
Trend
→steady
star history
LlamaGen is a family of image generation models that adapt the next-token prediction paradigm of large language models to the visual domain. It trains autoregressive transformer models on discretized image tokens, eliminating the need for diffusion-based architectures. The project provides pre-trained model weights ranging from 100M to 3B parameters and training/sampling code in PyTorch, with vLLM support for inference serving.