OpenGVLab/InternVL

A family of open-source multimodal large language models supporting vision-language tasks such as image classification, semantic segmentation, and video understanding.

Velocity (7d): +11 ★ / day
Trend: steady

InternVL is a research project providing open-source multimodal models that compete with commercial systems like GPT-4o. It supports visual question answering, image-text retrieval, semantic segmentation, and video classification through a vision-language architecture combining ViT encoders with large language models. The project offers model weights, training code, inference tools, and a chat demo.
