
zai-org/VisualGLM-6B

Open-source multimodal conversational language model with 7.8 billion parameters, supporting dialogue about images in Chinese and English.

VisualGLM-6B
Velocity (7d): +3.6 ★ / day
Trend: steady

VisualGLM-6B is a multimodal dialog language model combining text and image understanding. The language component derives from ChatGLM-6B with 6.2 billion parameters, while visual information is processed through a BLIP2-Qformer bridge trained to align visual representations with the language model, bringing the total model to 7.8 billion parameters. It supports bilingual Chinese-English conversation with image inputs, enabling users to discuss visual content in natural language. The project is built on the SwissArmyTransformer library for efficient model training and deployment.
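
The parameter figures above can be turned into a rough size estimate. The following is a minimal sketch, not part of the project: the helper names are hypothetical, and the memory figure assumes weights stored at 2 bytes per parameter (fp16) and counts weights only, ignoring activations and any runtime overhead. The source does not break down the non-language parameters further, so the remainder is reported as a single number.

```python
# Parameter counts taken from the description above, expressed in millions
# so the arithmetic stays in exact integers.
TOTAL_PARAMS_M = 7_800      # whole model: 7.8 billion
LANGUAGE_PARAMS_M = 6_200   # ChatGLM-6B language component: 6.2 billion


def non_language_params_m(total_m: int, language_m: int) -> int:
    """Parameters (in millions) outside the language component."""
    return total_m - language_m


def weight_memory_gb(params_m: int, bytes_per_param: int) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes).

    Assumption: every parameter uses the same number of bytes.
    """
    # (params_m * 1e6 parameters) * bytes_per_param / 1e9 bytes per GB
    return params_m * bytes_per_param / 1000


if __name__ == "__main__":
    rest = non_language_params_m(TOTAL_PARAMS_M, LANGUAGE_PARAMS_M)
    print(f"Non-language parameters: {rest / 1000:.1f} billion")
    print(f"fp16 weights (assumed): {weight_memory_gb(TOTAL_PARAMS_M, 2):.1f} GB")
```

Under these assumptions, about 1.6 billion parameters sit outside the language component, and the full set of weights would need roughly 15.6 GB at fp16.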
