
illuin-tech/colpali

A library for training and running visual document retrievers using Vision Language Models based on ColBERT-style multi-vector embeddings.

Velocity (7d): +3.7 ★ / day · Trend: steady

This repository implements ColPali and related ColVision models for document retrieval. It leverages Vision Language Models like PaliGemma to create multi-vector embeddings directly in the visual space, avoiding the need for OCR or text extraction. The models support training and inference for efficient visual document retrieval, with variants including ColQwen2 and ColSmol. It is associated with the ViDoRe benchmark for evaluating retrieval systems.
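To make "ColBERT-style multi-vector embeddings" concrete, here is a minimal sketch of the late-interaction (MaxSim) scoring that ColBERT-style retrievers use: each query vector is matched against its most similar page vector, and those best matches are summed. It uses plain NumPy and does not call the repository's own API. The function names `maxsim_score` and `rank_pages` and the toy 2-dimensional vectors are assumptions made for illustration; real models emit many higher-dimensional vectors per query and per page image.

```python
import numpy as np


def maxsim_score(query_vecs: np.ndarray, page_vecs: np.ndarray) -> float:
    """Late-interaction score between one query and one page.

    query_vecs: (n_query_tokens, dim) array, one vector per query token.
    page_vecs:  (n_page_patches, dim) array, one vector per image patch.
    """
    # Similarity of every query vector with every page vector.
    sim = query_vecs @ page_vecs.T  # shape: (n_query_tokens, n_page_patches)
    # Keep the best-matching page vector for each query vector, then sum.
    return float(sim.max(axis=1).sum())


def rank_pages(query_vecs: np.ndarray, pages: list[np.ndarray]):
    """Return page indices sorted by descending MaxSim score, plus the scores."""
    scores = [maxsim_score(query_vecs, p) for p in pages]
    order = sorted(range(len(pages)), key=lambda i: scores[i], reverse=True)
    return order, scores


# Toy data (hypothetical): two query vectors, two candidate pages.
query = np.array([[1.0, 0.0], [0.0, 1.0]])
page_a = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])  # matches both query vectors
page_b = np.array([[1.0, 0.0]])                          # matches only the first

order, scores = rank_pages(query, [page_a, page_b])
print(order, scores)
```

Because every page keeps one vector per patch instead of a single pooled vector, a query term can match a specific region of the page image, which is what lets this style of retrieval work without OCR.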
