
TonyLianLong/LLM-groundedDiffusion

Research project that uses LLMs as prompt parsers to enhance Stable Diffusion's ability to understand and generate images from complex text descriptions.

Velocity (7d): +0.4 ★/day
Trend: steady

LLM-grounded Diffusion (LMD) improves text-to-image generation by using a large language model to parse the user's prompt into a structured intermediate representation (such as an image layout) before passing it to Stable Diffusion. This improves the model's handling of complex, compositional, and spatially specific prompts. The project includes training code and evaluation benchmarks, and has been integrated into the Hugging Face diffusers library as LMD+.
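To make the "structured intermediate representation" concrete, here is a minimal sketch of what parsing an LLM reply into an image layout could look like. The reply format (an `Objects:` line and a `Background:` line), the `Box` and `Layout` classes, and the `parse_layout` helper are all assumptions made for illustration. They are not the project's actual format or API, and the diffusion stage that would consume the layout is omitted.

```python
import ast
from dataclasses import dataclass


@dataclass
class Box:
    phrase: str  # object description, e.g. "a gray cat"
    x: int       # left edge in pixels
    y: int       # top edge in pixels
    w: int       # width in pixels
    h: int       # height in pixels


@dataclass
class Layout:
    boxes: list      # list of Box, one per object in the scene
    background: str  # description of everything outside the boxes


def parse_layout(response: str, canvas: int = 512) -> Layout:
    """Parse a hypothetical 'Objects: ... / Background: ...' reply."""
    # Collect "key: value" lines into a dict keyed by lowercased name.
    fields = {}
    for line in response.strip().splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip().lower()] = value.strip()

    boxes = []
    # literal_eval only accepts Python literals, so the reply cannot run code.
    for phrase, (x, y, w, h) in ast.literal_eval(fields["objects"]):
        inside = x >= 0 and y >= 0 and w > 0 and h > 0
        inside = inside and x + w <= canvas and y + h <= canvas
        if not inside:
            raise ValueError(f"box for {phrase!r} does not fit the canvas")
        boxes.append(Box(phrase, x, y, w, h))
    return Layout(boxes, fields.get("background", ""))


# Assumed example reply for the prompt
# "a gray cat to the left of a wooden table in a living room".
reply = """
Objects: [("a gray cat", [60, 200, 180, 200]), ("a wooden table", [260, 250, 220, 180])]
Background: a cozy living room
"""

layout = parse_layout(reply)
for box in layout.boxes:
    print(box.phrase, (box.x, box.y, box.w, box.h))
print("background:", layout.background)
```

Validating the boxes before generation is one plausible design choice: a layout that places an object outside the canvas is caught at the parsing stage instead of producing a confusing image later.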
