poloclub/diffusiondb
A large-scale dataset of 14 million images generated by Stable Diffusion with real user prompts and hyperparameters for AI research.

Velocity (7d): +1.0 ★/day · Trend: steady
DiffusionDB is a dataset of 14 million images generated by Stable Diffusion, each paired with the prompt and hyperparameters specified by a real user. It is designed to support research on how prompts interact with generative models, on detecting deepfakes, and on designing human-AI interaction tools. It is available on Hugging Face in two subsets (2M and 14M images), each with accompanying metadata.
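To make the "image plus prompt plus hyperparameters" pairing concrete, here is a minimal sketch of reading one metadata file into typed records. It assumes a JSON object keyed by image filename, with short field names `p` (prompt), `se` (seed), `c` (guidance scale), `st` (steps), and `sa` (sampler). That layout is an assumption and should be checked against the dataset card before use. `PromptRecord`, `parse_part_metadata`, and the sample entries are hypothetical and exist only for illustration.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class PromptRecord:
    """One image and the settings used to generate it (hypothetical type)."""
    image_name: str
    prompt: str
    seed: int
    cfg_scale: float
    steps: int
    sampler: str


def parse_part_metadata(raw_json: str) -> List[PromptRecord]:
    """Turn a metadata JSON string into a list of PromptRecord objects.

    Assumes the layout {filename: {"p", "se", "c", "st", "sa"}};
    this key naming is an assumption, not confirmed by the text above.
    """
    entries = json.loads(raw_json)
    return [
        PromptRecord(
            image_name=name,
            prompt=meta["p"],
            seed=int(meta["se"]),
            cfg_scale=float(meta["c"]),
            steps=int(meta["st"]),
            sampler=meta["sa"],
        )
        for name, meta in entries.items()
    ]


# Made-up sample entries, not real dataset rows.
SAMPLE = json.dumps({
    "example-0001.png": {"p": "a watercolor fox", "se": 42, "c": 7.0, "st": 50, "sa": "k_lms"},
    "example-0002.png": {"p": "a city at dusk", "se": 7, "c": 12.5, "st": 30, "sa": "k_euler"},
})

records = parse_part_metadata(SAMPLE)
for record in records:
    print(record.image_name, "|", record.prompt, "|", record.steps, "steps")
```

Keeping each record as a small dataclass makes it straightforward to filter or group prompts by sampler, step count, or guidance scale before loading any image data.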