QwenLM/Qwen-VL
A large vision-language model family from Alibaba that processes and reasons over images and text.

Velocity (7d): +6.5 ★ / day
Trend: steady
Qwen-VL is a family of pretrained and chat-optimized vision-language models that extends large language model capabilities to visual understanding. It supports image comprehension, multi-round dialogue, and multiple languages, and comes in base, chat, and quantized variants. The repository provides model weights, training code, and evaluation tools for the Qwen-VL model family.