cloneofsimo/paint-with-words-sd
A Stable Diffusion implementation of the paint-with-words technique from eDiff-I that lets users generate images from text-labeled segmentation maps.

This repository implements the paint-with-words method from NVIDIA’s eDiff-I paper on top of Stable Diffusion. The user supplies a segmentation map in which each colored region is associated with a text label. The implementation then modifies the cross-attention scores to guide the diffusion process spatially, which gives fine-grained control over object placement without changing the seed. The repository includes before/after comparisons showing how objects such as clouds or roads can be selectively included or positioned.
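
To make the cross-attention idea concrete, here is a minimal NumPy sketch of biasing attention scores with a region mask. It is an illustration under stated assumptions, not the repository's code: the function names `build_region_mask` and `biased_cross_attention`, the integer label map (standing in for a colored segmentation map), and the fixed scalar `weight` are all hypothetical. The actual implementation may use a different strength schedule, for example one that depends on the noise level.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def build_region_mask(label_map, label_to_tokens, n_tokens):
    """Turn an (H, W) integer label map into a (H*W, n_tokens) 0/1 matrix.

    Entry [p, t] is 1 when pixel p lies in a region whose text label
    includes prompt token t. Both arguments are hypothetical stand-ins
    for a colored segmentation map and its color-to-text mapping.
    """
    h, w = label_map.shape
    mask = np.zeros((h * w, n_tokens), dtype=np.float32)
    flat = label_map.reshape(-1)
    for label, token_ids in label_to_tokens.items():
        rows = np.nonzero(flat == label)[0]
        mask[np.ix_(rows, token_ids)] = 1.0
    return mask

def biased_cross_attention(q, k, v, region_mask, weight):
    """Cross-attention with an additive spatial bias on the scores.

    q: (pixels, d) image queries; k, v: (tokens, d) text keys/values.
    `weight` is an assumed fixed scalar controlling the bias strength.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)          # standard scaled dot product
    scores = scores + weight * region_mask  # push each pixel toward its label's tokens
    probs = softmax(scores, axis=-1)
    return probs @ v, probs

# Tiny demo: a 2x2 image whose left column is label 0 and right column label 1.
label_map = np.array([[0, 1], [0, 1]])
mask = build_region_mask(label_map, {0: [0], 1: [1]}, n_tokens=3)
rng = np.random.default_rng(0)
q, k, v = rng.normal(size=(4, 8)), rng.normal(size=(3, 8)), rng.normal(size=(3, 8))
_, plain = biased_cross_attention(q, k, v, mask, weight=0.0)
_, guided = biased_cross_attention(q, k, v, mask, weight=5.0)
```

With `weight=0.0` the function reduces to ordinary cross-attention; raising `weight` increases the attention each pixel pays to the tokens of its own region's label, which is the mechanism the description above refers to as spatial guidance.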