
mbzuai-oryx/Video-ChatGPT

Video-ChatGPT is a vision-language model that enables conversational interaction about videos by combining a pretrained video encoder with large language models.

Star velocity (7d): +1.3 ★/day · Trend: steady

The model holds conversations about video content by integrating spatiotemporal video representations from a visual encoder with the reasoning capabilities of an LLM. It was published at ACL 2024 and introduces a quantitative benchmark, VCGBench-Diverse, designed to evaluate video-based conversational models across diverse dimensions. The system also supports zero-shot question answering on video datasets.
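To make "spatiotemporal video representations" concrete, here is a minimal sketch of one way per-frame patch features could be pooled into a short token sequence and projected into an LLM's embedding space. This is an illustration under assumptions, not the repository's code: the pooling scheme (averaging over patches for temporal tokens, over frames for spatial tokens), the function names `mean_vectors`, `spatiotemporal_pool`, and `project`, and all shapes and values are hypothetical.

```python
from typing import List

Vector = List[float]


def mean_vectors(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def spatiotemporal_pool(frame_features: List[List[Vector]]) -> List[Vector]:
    """Pool features of shape (T frames, N patches, D dims) into T + N tokens.

    Temporal tokens: one per frame, averaged over that frame's N patches.
    Spatial tokens: one per patch position, averaged over the T frames.
    (Assumed pooling scheme, chosen for illustration.)
    """
    temporal_tokens = [mean_vectors(patches) for patches in frame_features]
    spatial_tokens = [
        mean_vectors(list(same_patch)) for same_patch in zip(*frame_features)
    ]
    return temporal_tokens + spatial_tokens


def project(tokens: List[Vector], weight: List[Vector]) -> List[Vector]:
    """Linear projection: tokens (M, D) times weight (D, H) gives (M, H)."""
    columns = list(zip(*weight))
    return [
        [sum(t * w for t, w in zip(token, column)) for column in columns]
        for token in tokens
    ]


# Toy input: T=2 frames, N=3 patches per frame, D=2 feature dimensions.
frames = [
    [[1.0, 0.0], [3.0, 0.0], [5.0, 0.0]],
    [[1.0, 2.0], [3.0, 2.0], [5.0, 2.0]],
]
tokens = spatiotemporal_pool(frames)  # 2 temporal + 3 spatial tokens

# Hypothetical projection weights mapping D=2 to an LLM width of H=3.
weight = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
projected = project(tokens, weight)
print(len(projected), len(projected[0]))
```

In a setup like this, the projected tokens would be placed alongside the text token embeddings of the user's question, so the language model sees the video as a short sequence of extra input tokens. The appeal of pooling is that the sequence length grows as T + N rather than T × N.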
