HKUDS/VideoRAG
A retrieval-augmented generation system enabling natural language chat with video content using multimodal LLMs.

Velocity (7d): +6.2 ★/day · Trend: steady
VideoRAG is a research system, published at KDD'2026, that lets users hold conversational interactions with video content. It applies retrieval-augmented generation to long-video understanding: multimodal large language models process the video frames, the segments relevant to a user query are retrieved, and an answer is generated from them. The system is available as Vimo Desktop for practical use.
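The retrieve-then-generate flow described above can be illustrated with a toy sketch. This is not VideoRAG's code or API: `Segment`, `retrieve`, and `build_prompt` are hypothetical names, each segment is assumed to already carry a text caption (in a real system a multimodal model would produce it from frames), and plain bag-of-words cosine similarity stands in for whatever retrieval the project actually uses.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical record for one video segment."""
    start_s: float
    end_s: float
    caption: str  # assumed text description of the segment's frames


def _vectorize(text: str) -> Counter:
    # Bag-of-words term counts; a stand-in for a learned embedding.
    return Counter(text.lower().split())


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, segments: list[Segment], k: int = 2) -> list[Segment]:
    """Return the k segments whose captions are most similar to the query."""
    q = _vectorize(query)
    ranked = sorted(
        segments,
        key=lambda s: _cosine(q, _vectorize(s.caption)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, segments: list[Segment]) -> str:
    """Assemble retrieved segments into context for a language model call."""
    context = "\n".join(
        f"[{s.start_s:.0f}s-{s.end_s:.0f}s] {s.caption}" for s in segments
    )
    return f"Context:\n{context}\n\nQuestion: {query}"
```

The prompt returned by `build_prompt` would then be passed to a language model to generate the answer; that call is omitted here because it depends on the model and client in use.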