Alpha-VLLM/Lumina-mGPT-2.0
A decoder-only autoregressive model that unifies image generation tasks, including text-to-image synthesis, image editing, and controllable generation.

The project implements a transformer-based autoregressive decoder that generates images token by token and is trained from scratch within a single unified framework. It covers a range of image generation tasks: text-to-image generation, image pair generation, subject-driven generation, multi-turn image editing, controllable generation, and dense prediction. Model checkpoints and inference code are publicly available on HuggingFace.
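
To make "token by token" concrete, here is a minimal sketch of an autoregressive sampling loop. It is not the project's inference code: `toy_logits` is a hypothetical stand-in for a decoder forward pass, and `VOCAB_SIZE`, `GRID`, and the integer prompt tokens are made-up values chosen only for illustration. The point is the control flow, where each new image token is sampled conditioned on the prompt plus every token generated so far.

```python
import math
import random

VOCAB_SIZE = 16  # hypothetical codebook size of an image tokenizer
GRID = 4         # hypothetical 4x4 grid of image tokens


def toy_logits(prefix):
    """Stand-in for a decoder forward pass (assumption, not the real model).

    Returns one pseudo-random logit per vocabulary entry, derived
    deterministically from the prefix so that output depends on context.
    """
    rng = random.Random(len(prefix) * 1000 + sum(prefix))
    return [rng.uniform(-1.0, 1.0) for _ in range(VOCAB_SIZE)]


def sample(logits, temperature, rng):
    """Sample one token id from temperature-scaled softmax weights."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(value - peak) for value in scaled]
    return rng.choices(range(VOCAB_SIZE), weights=weights, k=1)[0]


def generate(prompt_tokens, temperature=1.0, seed=0):
    """Append GRID*GRID image tokens to the prompt, one token per step."""
    rng = random.Random(seed)
    sequence = list(prompt_tokens)
    for _ in range(GRID * GRID):
        logits = toy_logits(sequence)  # condition on everything so far
        sequence.append(sample(logits, temperature, rng))
    image_tokens = sequence[len(prompt_tokens):]
    # Reshape the flat token stream into rows of a 2D token grid.
    return [image_tokens[r * GRID:(r + 1) * GRID] for r in range(GRID)]


if __name__ == "__main__":
    for row in generate([1, 2, 3]):
        print(row)
```

In a real system the resulting token grid would be passed to an image tokenizer's decoder to produce pixels; that step is omitted here because it depends on the specific tokenizer.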