poloclub/diffusiondb

A large-scale dataset of 14 million images generated by Stable Diffusion with real user prompts and hyperparameters for AI research.

Star velocity (7 days): +1.0 ★/day. Trend: steady.

DiffusionDB is a dataset of 14 million text-to-image pairs generated by Stable Diffusion from prompts and hyperparameters specified by real users. It is designed to support research on the interplay between prompts and generative models, deepfake detection, and the design of human-AI interaction tools. The dataset is distributed on Hugging Face in two subsets, DiffusionDB 2M (2 million images) and DiffusionDB Large (14 million images), each with accompanying metadata.
