gligen/GLIGEN

A text-to-image diffusion model that grounds generation on spatial inputs like bounding boxes, keypoints, and reference images.

Velocity (7d): +1.8 ★/day · Trend: steady

GLIGEN extends frozen, pretrained text-to-image diffusion models so they accept additional spatial conditioning inputs, including bounding boxes, keypoints, and reference images. Published at CVPR 2023, it reports zero-shot performance on the COCO and LVIS benchmarks that exceeds supervised layout-to-image baselines. The project includes inference code and a Hugging Face Spaces demo.
