AnswerDotAI/byaldi
A Python library providing a simplified API for multi-modal late-interaction retrieval using ColPali and ColQwen2 models.

Byaldi wraps ColPali and related multi-modal models to enable easy retrieval over complex documents with embedded images. It converts PDFs to images and indexes them with late-interaction embeddings, so developers can run semantic search across multi-modal content in just a few lines of code. The library is built on colpali-engine and aims to support all ColVLM models; HNSW indexing and quantization optimizations are planned.
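
To make "late interaction" concrete, here is a minimal pure-Python sketch of MaxSim scoring, the scoring rule used by ColBERT-style late-interaction retrievers. It is an illustration of the general technique, not Byaldi's API or implementation: the function names (`dot`, `maxsim_score`, `rank_pages`) and the toy 2-dimensional vectors are assumptions made for this example. Real models emit one much higher-dimensional embedding per query token and per image patch.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: Sequence[Vector], page_vecs: Sequence[Vector]) -> float:
    """Late-interaction (MaxSim) score between one query and one page.

    For each query-token embedding, take its best match among the page's
    patch embeddings, then sum those best-match similarities.
    """
    return sum(max(dot(q, p) for p in page_vecs) for q in query_vecs)


def rank_pages(
    query_vecs: Sequence[Vector], pages: Sequence[Sequence[Vector]]
) -> List[Tuple[int, float]]:
    """Return (page_index, score) pairs sorted from best to worst match."""
    scored = [(i, maxsim_score(query_vecs, page)) for i, page in enumerate(pages)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: two query-token embeddings, two pages of patch embeddings.
query = [[1.0, 0.0], [0.0, 1.0]]
page_a = [[1.0, 0.0], [0.0, 1.0]]  # has a good match for both query tokens
page_b = [[1.0, 0.0], [1.0, 0.0]]  # matches only the first query token

ranking = rank_pages(query, [page_a, page_b])
```

Because every query token is matched against every patch independently, a page can score well when different regions of the image answer different parts of the query, which is what lets this approach work directly on page images.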