
lumina-ai-inc/chunkr

A Rust-based document processing service that uses vision-language models for layout analysis, OCR, and semantic chunking to prepare documents for RAG pipelines.

2.9k stars · Rust · RAG · Search · Data Tooling
Velocity (7d): +4.5 ★/day · Trend: steady

Chunkr is an open-source document intelligence API designed to preprocess complex documents for retrieval-augmented generation systems. It performs layout analysis to identify document structure, applies OCR with bounding box extraction for scanned content, and generates semantically coherent chunks suitable for vector storage and retrieval. The service integrates vision-language model processing to handle complex visual elements within documents.
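To make the last step of that pipeline concrete, here is a minimal sketch of how layout segments with bounding boxes might be grouped into chunks. It is an illustration only and not Chunkr's API or algorithm: the `Segment` type, the `chunk_segments` function, the `(left, top, width, height)` box convention, and the word-budget rule are all assumptions made for this example, and it is written in Python for brevity even though the service itself is written in Rust.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Segment:
    """One layout element. Hypothetical type, not Chunkr's actual schema."""
    kind: str                                  # e.g. "title", "text", "table"
    text: str                                  # text recovered for the element (e.g. via OCR)
    bbox: Tuple[float, float, float, float]    # assumed (left, top, width, height)
    page: int


def chunk_segments(segments: List[Segment], max_words: int = 50) -> List[List[Segment]]:
    """Group segments in reading order.

    A new chunk starts at every title, or when adding the next segment
    would push the current chunk past the word budget.
    """
    chunks: List[List[Segment]] = []
    current: List[Segment] = []
    count = 0
    for seg in segments:
        words = len(seg.text.split())
        if current and (seg.kind == "title" or count + words > max_words):
            chunks.append(current)
            current, count = [], 0
        current.append(seg)
        count += words
    if current:
        chunks.append(current)
    return chunks


# Made-up sample input standing in for the output of layout analysis and OCR.
segments = [
    Segment("title", "Introduction", (50, 40, 500, 30), 1),
    Segment("text", "This report covers the first quarter.", (50, 80, 500, 60), 1),
    Segment("title", "Results", (50, 160, 500, 30), 1),
    Segment("text", "Revenue grew compared with last year.", (50, 200, 500, 60), 1),
]

chunks = chunk_segments(segments)
for chunk in chunks:
    print([seg.kind for seg in chunk])
```

Splitting at titles keeps each heading together with the text it introduces, and each segment keeps its bounding box so a retrieved chunk can be traced back to its location on the page.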
