Alpha-VLLM/Lumina-mGPT-2.0

A decoder-only autoregressive model for unified image generation tasks including text-to-image synthesis, image editing, and controllable generation.

★ 1.1k stars · Python · Image · Video · Audio

Velocity (7d): +2.5 ★ / day · Trend: steady

The project implements a transformer-based autoregressive decoder that generates images token by token and is trained from scratch within a unified framework. It covers a range of image generation tasks: text-to-image generation, image pair generation, subject-driven generation, multi-turn image editing, controllable generation, and dense prediction. Model checkpoints and inference code are publicly available on Hugging Face.
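
To make the token-by-token idea concrete, here is a minimal sketch of an autoregressive sampling loop. It does not use the Lumina-mGPT-2.0 code or API: `toy_logits`, `VOCAB_SIZE`, `GRID`, and the prompt token values are hypothetical stand-ins, and the scoring function is a fixed rule rather than a trained transformer. It only illustrates the general pattern of conditioning each new image token on the prompt plus everything generated so far.

```python
import math
import random

VOCAB_SIZE = 16  # hypothetical image-token vocabulary size (assumption)
GRID = 4         # hypothetical 4x4 grid of image tokens (assumption)


def toy_logits(prefix):
    """Stand-in for the decoder: one score per vocabulary entry.

    A real model would run a transformer over `prefix`; this toy rule
    simply favors tokens numerically close to the last one.
    """
    last = prefix[-1] if prefix else 0
    return [-0.5 * abs(tok - last) for tok in range(VOCAB_SIZE)]


def sample(logits, rng, temperature=1.0):
    """Draw one index from softmax(logits / temperature)."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(value - peak) for value in scaled]
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def generate(prompt_tokens, num_image_tokens, seed=0):
    """Append image tokens one at a time, each conditioned on the full prefix."""
    rng = random.Random(seed)
    sequence = list(prompt_tokens)
    for _ in range(num_image_tokens):
        sequence.append(sample(toy_logits(sequence), rng))
    return sequence[len(prompt_tokens):]


# Hypothetical text-prompt tokens followed by a 4x4 grid of image tokens.
image_tokens = generate(prompt_tokens=[3, 7], num_image_tokens=GRID * GRID)
grid = [image_tokens[row * GRID:(row + 1) * GRID] for row in range(GRID)]
```

In this sketch the flat token list is reshaped into rows in raster order. Turning such tokens back into pixels would need an image tokenizer's decoder, which is left out here.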