OpenGVLab/Multi-Modality-Arena
A benchmarking platform for comparing large vision-language models side-by-side on visual question-answering tasks.

Velocity (7d): +0.5 ★ / day
Trend: steady
Multi-Modality Arena is an evaluation platform for large multimodal models that follows the Chatbot Arena methodology: two anonymous models are shown side by side and compared on visual question-answering tasks. It supports vision-language models including MiniGPT-4, LLaVA, BLIP-2, and LLaMA-Adapter V2. The platform also includes evaluation benchmarks such as OmniMedVQA, for medical LVLMs, and Tiny LVLM-eHub, for rapid model comparison.
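As a rough illustration of how side-by-side comparisons can be turned into a ranking, the sketch below applies a standard Elo update to a list of pairwise outcomes, which is the kind of rating commonly associated with Chatbot Arena-style evaluation. It is a minimal sketch and not the project's actual code: the function names, the initial rating of 1000, the K-factor of 32, and the sample votes are all assumptions made up for illustration.

```python
from collections import defaultdict


def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(ratings, model_a, model_b, outcome, k=32.0):
    """Update ratings in place after one comparison.

    outcome is 1.0 if model_a was preferred, 0.0 if model_b was preferred,
    and 0.5 for a tie. k=32 is an assumed K-factor, not a project setting.
    """
    delta = k * (outcome - expected_score(ratings[model_a], ratings[model_b]))
    ratings[model_a] += delta
    ratings[model_b] -= delta


def rate(battles, initial=1000.0, k=32.0):
    """Process (model_a, model_b, outcome) tuples in order; return final ratings."""
    ratings = defaultdict(lambda: initial)
    for model_a, model_b, outcome in battles:
        update_elo(ratings, model_a, model_b, outcome, k)
    return dict(ratings)


if __name__ == "__main__":
    # Hypothetical votes, invented purely to exercise the code.
    sample_battles = [
        ("LLaVA", "BLIP-2", 1.0),
        ("MiniGPT-4", "LLaVA", 0.5),
        ("BLIP-2", "MiniGPT-4", 0.0),
    ]
    for name, score in sorted(rate(sample_battles).items(), key=lambda kv: -kv[1]):
        print(f"{name}: {score:.1f}")
```

Because each update adds to one model exactly what it subtracts from the other, the total rating across models stays constant; the result does depend on the order in which votes are processed, which is a known property of online Elo updates.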