superlinear-ai/raglite
A Python RAG toolkit backed by DuckDB or PostgreSQL, with support for multiple LLM providers, rerankers, and late-chunking embeddings.

RAGLite provides a complete pipeline for retrieval-augmented generation (RAG) workflows. It supports multiple LLM providers via LiteLLM, including local llama-cpp-python models; stores and searches vectors using DuckDB or PostgreSQL with pgvector; and reranks results with any reranker. The toolkit also includes PDF-to-Markdown conversion, multi-vector late chunking for improved embeddings, and hardware acceleration via Metal on macOS and CUDA on Linux/Windows.
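
To make the search-then-rerank stage of such a pipeline concrete, here is a minimal, library-free sketch. It does not use RAGLite's API. The function names (`vector_search`, `rerank`), the three-dimensional toy embeddings, and the word-overlap reranker are all hypothetical stand-ins chosen for illustration; a real setup would use an embedding model, a vector index in DuckDB or PostgreSQL, and a trained reranker.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_search(query_vec, index, k):
    """Return the ids of the k chunks whose embeddings are closest to query_vec."""
    scored = sorted(index.items(), key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
    return [chunk_id for chunk_id, _ in scored[:k]]


def rerank(query, chunk_ids, texts):
    """Toy reranker (assumption): order candidates by word overlap with the query."""
    query_words = set(query.lower().split())
    return sorted(
        chunk_ids,
        key=lambda cid: len(query_words & set(texts[cid].lower().split())),
        reverse=True,
    )


# Hypothetical chunk texts and hand-written embeddings, for illustration only.
texts = {
    "a": "DuckDB stores vectors in a local file",
    "b": "PostgreSQL with pgvector stores vectors on a server",
    "c": "Metal accelerates local models on macOS",
}
index = {"a": [1.0, 0.1, 0.0], "b": [0.9, 0.3, 0.0], "c": [0.0, 0.1, 1.0]}

query = "pgvector server"
query_vec = [1.0, 0.2, 0.0]  # stand-in for an embedded query

candidates = vector_search(query_vec, index, k=2)  # coarse recall by similarity
ranked = rerank(query, candidates, texts)          # finer ordering of the shortlist
print(candidates, ranked)
```

In this toy data the vector search shortlists chunks `a` and `b`, and the reranker then moves `b` to the top because it shares more words with the query. The two-stage design keeps the expensive scoring step limited to a small candidate set.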