
MME-Benchmarks/Video-MME

Comprehensive benchmark for evaluating multi-modal LLMs' video analysis capabilities across 9 domains with 3,000 human-labeled questions.

Star velocity (7d): +1.1 ★ / day · Trend: steady

Video-MME is the first comprehensive evaluation benchmark for assessing multi-modal LLMs on video understanding tasks. It covers 9 domains of video analysis, including long-video comprehension and temporal reasoning, and contains 3,000 curated, human-annotated questions. It serves as a standardized basis for comparing multi-modal LLM performance in video understanding and has been adopted by major AI labs for model evaluation.
