OpenBMB/VisRAG
A parsing-free RAG system that leverages vision-language models for visual document retrieval and multi-image reasoning.

VisRAG 2.0 is a retrieval-augmented generation (RAG) system for visual documents that works without traditional text parsing. Instead of extracting text first, it uses vision-language models to retrieve and reason directly over visual content, including images and documents. The system provides specialized retrieval models (VisRAG-Ret) and generation models (EVisRAG), which together enable evidence-guided multi-image reasoning for visual question answering and document understanding tasks.
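
The retrieve-then-generate flow described above can be illustrated with a minimal, dependency-free sketch. It is not the project's actual API: the function names (retrieve_pages, answer_with_evidence, stub_generate), the hand-written embedding vectors, and the use of cosine similarity for ranking are all assumptions made for illustration. In a real setup the vectors would presumably come from a retrieval model such as VisRAG-Ret, and the stub generator would be replaced by a generation model such as EVisRAG.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_pages(query_vec: Vector, page_vecs: List[Vector],
                   top_k: int) -> List[Tuple[int, float]]:
    """Rank page-image embeddings against the query; return (index, score) pairs."""
    scored = [(i, cosine(query_vec, v)) for i, v in enumerate(page_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def answer_with_evidence(question: str, query_vec: Vector,
                         page_vecs: List[Vector], page_images: List[str],
                         generate: Callable[[str, List[str]], str],
                         top_k: int = 2) -> str:
    """Retrieve the top-k page images and pass them, unparsed, to a generator."""
    hits = retrieve_pages(query_vec, page_vecs, top_k)
    evidence = [page_images[i] for i, _ in hits]
    return generate(question, evidence)


def stub_generate(question: str, images: List[str]) -> str:
    """Hypothetical stand-in for a vision-language generator; it only echoes inputs."""
    return f"{question} | evidence: {', '.join(images)}"


# Toy data: the file names and vectors are made up, not real model output.
PAGE_IMAGES = ["page_0.png", "page_1.png", "page_2.png"]
PAGE_VECS = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
QUERY_VEC = [1.0, 0.0, 0.0]

if __name__ == "__main__":
    print(answer_with_evidence("What does the chart show?", QUERY_VEC,
                               PAGE_VECS, PAGE_IMAGES, stub_generate))
```

The point of the sketch is that the documents are never converted to text: page images are ranked by embedding similarity and the selected images themselves are handed to the generator as evidence.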