
TencentQQGYLab/ELLA

ELLA equips diffusion models with large language models to improve semantic alignment in text-to-image generation.

Velocity (7d): +1.6 ★/day · Trend: steady

ELLA is a research project that pairs text-to-image diffusion models with large language models (LLMs) so that generated images follow complex text prompts more faithfully. The repository also includes EMMA, a related technique that lets text-to-image models accept multi-modal prompts.
