wenhaochai/MovieChat

MovieChat is a multimodal LLM system that handles long video understanding by converting dense tokens to sparse memory for efficient processing on standard GPUs.

700 stars · Python · Language Models · Computer Vision
Velocity · 7d: +0.6 ★ / day · Trend: steady

MovieChat is the open-source implementation of a CVPR 2024 paper on long video understanding, combining a computer vision model with a language model. Its sparse memory mechanism lets it process videos of more than 10K frames on a 24GB GPU, with far less memory overhead than approaches that keep dense tokens for every frame. The system builds on LLaMA and includes MovieChat-1K, a benchmark and leaderboard for long video understanding.
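
The summary above does not spell out how dense tokens become sparse memory. As a rough illustration only, the sketch below assumes the memory is compacted by repeatedly merging the most similar pair of adjacent frame features until a fixed budget is met. The names `cosine` and `consolidate`, the plain averaging step, and the toy two-dimensional features are all hypothetical and are not taken from the repository's code.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def consolidate(frames, max_len):
    """Merge the most similar adjacent pair until at most max_len entries remain.

    Assumption: merging is a plain average of the two feature vectors.
    """
    if max_len < 1:
        raise ValueError("max_len must be at least 1")
    memory = [list(f) for f in frames]  # copy so the input is not mutated
    while len(memory) > max_len:
        # Similarity of each entry to its right-hand neighbour.
        sims = [cosine(memory[i], memory[i + 1]) for i in range(len(memory) - 1)]
        i = max(range(len(sims)), key=sims.__getitem__)
        merged = [(x + y) / 2 for x, y in zip(memory[i], memory[i + 1])]
        memory[i:i + 2] = [merged]  # replace the pair with its average
    return memory


# Toy per-frame features: two near-duplicate pairs plus one distinct frame.
frames = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [1.0, 1.0]]
memory = consolidate(frames, max_len=3)
print(len(memory))  # 3
```

In this toy run the two near-duplicate pairs collapse into one entry each while the distinct last frame is kept, which is the intuition behind trading dense per-frame tokens for a bounded memory. Each pass rescans all adjacent pairs, so the sketch favors clarity over speed.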
