EvolvingLMMs-Lab/lmms-eval
Unified evaluation toolkit for benchmarking multimodal large language models and vision-language models across diverse task types.

Velocity (7d): +5.1 ★/day · Trend: steady
LMMs-Eval is a benchmarking framework for evaluating multimodal AI models, including vision-language models (VLMs) and large language models (LLMs). It provides standardized benchmarks across text, image, video, and audio modalities, with support for over 100 evaluation tasks and more than 30 models. The toolkit is designed to measure model capabilities systematically in real-world scenarios.