EvolvingLMMs-Lab/lmms-eval

Unified evaluation toolkit for benchmarking multimodal large language models and vision-language models across diverse task types.

4.2k stars Python LLMOps · Eval
Velocity (7d): +5.1 ★/day · Trend: steady

LMMs-Eval is a benchmarking framework for evaluating multimodal AI models, including vision-language models (VLMs) and large language models (LLMs). It provides standardized benchmarks across text, image, video, and audio modalities, with support for more than 100 evaluation tasks and more than 30 models. The toolkit is designed to systematically probe and measure model capabilities in real-world scenarios.
