
OpenGVLab/Ask-Anything

Multi-modal LLM system enabling conversational video understanding through instruction-tuned video-to-language models.

Velocity (7d): +2.9 ★/day · Trend: steady

VideoChatGPT, a CVPR 2024 highlight, is a video understanding system that combines LLMs with visual processing for video question answering and captioning. It supports multiple foundation models, including miniGPT4, StableLM, and MOSS, and provides instruction tuning for video and image chat. The UI is built with Gradio and orchestration is handled by LangChain, giving an end-to-end chatbot for video and image understanding.
