linzhiqiu/t2v_metrics
A Python evaluation framework for measuring how well text-to-visual generation models align with textual prompts using VQAScore.

VQAScore provides automated evaluation for text-to-image, text-to-video, and text-to-3D generation models using a vision-language model. It recasts evaluation as an image-to-text visual question answering (VQA) task: the model is asked whether the generated output shows the input text, and the probability it assigns to a "Yes" answer serves as the alignment score. The repository also includes benchmark tools such as GenAI-Bench for compositional text-to-visual evaluation, and supports models from the CLIP-FlanT5 Model Zoo.
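The scoring idea can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: `toy_model`, `vqa_score`, and `yes_probability` are hypothetical names introduced here, the question wording is an assumed template, and the "image" is stood in for by a set of tags so the example runs without any model weights. A real scorer would query a vision-language model and take the probability of "Yes" over its full vocabulary.

```python
import math
from typing import Callable, Dict, Set

# Assumed question template; the exact wording used by the library may differ.
QUESTION_TEMPLATE = 'Does this figure show "{}"? Please answer yes or no.'


def format_question(text: str) -> str:
    """Wrap a generation prompt in a yes/no question."""
    return QUESTION_TEMPLATE.format(text)


def yes_probability(answer_logits: Dict[str, float]) -> float:
    """Softmax over the candidate answers, returning the mass on 'Yes'."""
    peak = max(answer_logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(v - peak) for tok, v in answer_logits.items()}
    return exps["Yes"] / sum(exps.values())


def toy_model(image_tags: Set[str], question: str) -> Dict[str, float]:
    """Hypothetical stand-in for a VQA model.

    The 'image' is a set of tags; the logit for 'Yes' counts prompt words
    found in the tags, and the logit for 'No' counts the words not found.
    Assumes the prompt itself contains no double quotes.
    """
    prompt = question.split('"')[1]
    words = prompt.lower().split()
    hits = sum(word in image_tags for word in words)
    return {"Yes": float(hits), "No": float(len(words) - hits)}


def vqa_score(
    image_tags: Set[str],
    text: str,
    model: Callable[[Set[str], str], Dict[str, float]] = toy_model,
) -> float:
    """Alignment score: probability that the model answers 'Yes'."""
    return yes_probability(model(image_tags, format_question(text)))


if __name__ == "__main__":
    print(vqa_score({"red", "cube"}, "red cube"))   # fully matching output
    print(vqa_score({"blue", "cube"}, "red cube"))  # partially matching output
```

Under these assumptions, an output that matches more of the prompt receives a higher score, which is the property the alignment metric relies on; swapping `toy_model` for a real vision-language model would leave the rest of the structure unchanged.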