zai-org/ImageReward
A general-purpose reward model that learns human preferences for text-to-image generation and optimizes diffusion models through Reward Feedback Learning.

ImageReward is the first general-purpose text-to-image reward model, trained on 137k expert comparison pairs. It outperforms CLIP, Aesthetic, and BLIP scoring methods at predicting human preferences for synthesized images. The project also includes Reward Feedback Learning (ReFL), which uses the learned reward model to optimize Stable Diffusion directly; ReFL-tuned models win 58.4% more often in human evaluation. Both the reward model and ReFL are packaged as a Python library.
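
To make "trained on comparison pairs" concrete, here is a minimal sketch of a pairwise preference objective of the kind commonly used for reward models. It is an illustration under assumptions, not the library's training code: the description above does not state the exact loss, and the names `pairwise_preference_loss` and `preference_accuracy` are hypothetical helpers that operate on plain floats rather than real model outputs.

```python
import math
from typing import Iterable, Tuple


def pairwise_preference_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Return -log(sigmoid(r_preferred - r_rejected)).

    Assumption: a standard logistic (Bradley-Terry style) ranking loss.
    The loss is small when the preferred image scores well above the
    rejected one, and large when the order is reversed.
    """
    margin = reward_preferred - reward_rejected
    # Numerically stable softplus(-margin) = log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def preference_accuracy(pairs: Iterable[Tuple[float, float]]) -> float:
    """Fraction of (preferred, rejected) score pairs ranked in the human order."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one comparison pair is required")
    correct = sum(1 for preferred, rejected in pairs if preferred > rejected)
    return correct / len(pairs)


if __name__ == "__main__":
    # Hypothetical reward scores for four comparison pairs.
    scores = [(1.0, 0.0), (0.0, 1.0), (2.0, 1.0), (3.0, 0.0)]
    print("agreement with annotators:", preference_accuracy(scores))
    print("loss for a correctly ranked pair:", pairwise_preference_loss(2.0, 0.0))
```

Agreement with human rankings, as computed by `preference_accuracy`, is one simple way a reward model could be compared against baseline scorers such as CLIP or BLIP on held-out comparison pairs.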