X-PLUG/mPLUG-Owl

A family of multi-modal large language models that process images, video, and text for visual recognition and dialogue tasks.

Star velocity (7d): +2.2 ★/day · Trend: steady

mPLUG-Owl is a modular family of multi-modal LLMs for image and video understanding. The series spans three generations: mPLUG-Owl, mPLUG-Owl2 (CVPR 2024 Highlight), and mPLUG-Owl3, which targets long image-sequence comprehension. The models are implemented in PyTorch and distributed via Hugging Face, and the repository provides training and evaluation code.