
NVIDIA/NeMo-Retriever

NVIDIA's document extraction and embedding pipeline for retrieval-augmented generation and generative AI applications.

2.9k stars · Python · RAG · Search · Data Tooling
Velocity (7d): +4.5 ★/day · Trend: steady

NeMo Retriever Library extracts text, tables, charts, and infographics from documents using OCR and classification, then computes vector embeddings for the extracted content and stores them in LanceDB for downstream generative AI and RAG applications. It leverages NVIDIA NIM microservices for scalable, production-grade document processing and can be deployed on Kubernetes using Helm charts.
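
The extract, embed, and store flow described above can be illustrated with a minimal, standard-library-only sketch. None of this is the NeMo Retriever API: `Chunk`, `embed`, `build_index`, and `search` are hypothetical names, the hashing "embedding" stands in for a real embedding model, and the in-memory list stands in for a LanceDB table. Extraction (OCR and classification) is assumed to have already produced the chunks.

```python
import hashlib
import math
from dataclasses import dataclass

DIM = 16  # toy embedding size, chosen only for illustration


@dataclass
class Chunk:
    """One extracted element (hypothetical type, not from the library)."""
    kind: str     # e.g. "text", "table", "chart", "infographic"
    content: str  # extracted text for that element


def embed(text: str, dim: int = DIM) -> list[float]:
    """Toy bag-of-words hashing embedding, L2-normalized.

    A stand-in for a real embedding model; it only captures token overlap.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        bucket = int(hashlib.sha256(token.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # avoid dividing by zero
    return [v / norm for v in vec]


def build_index(chunks: list[Chunk]) -> list[dict]:
    """Store each chunk next to its vector (an in-memory stand-in for a vector table)."""
    return [
        {"kind": c.kind, "content": c.content, "vector": embed(c.content)}
        for c in chunks
    ]


def search(index: list[dict], query: str, k: int = 1) -> list[dict]:
    """Return the k rows with the highest cosine similarity to the query."""
    q = embed(query)

    def score(row: dict) -> float:
        # Vectors are unit length, so the dot product is the cosine similarity.
        return sum(a * b for a, b in zip(q, row["vector"]))

    return sorted(index, key=score, reverse=True)[:k]


chunks = [
    Chunk("text", "quarterly revenue grew in the cloud segment"),
    Chunk("table", "region units shipped europe asia americas"),
    Chunk("chart", "latency percentile by batch size"),
]
index = build_index(chunks)
top = search(index, "quarterly revenue grew in the cloud segment")
```

In a real deployment the embedding step would call a model service and the index would live in a vector database; the sketch only shows how the three stages hand data to one another.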
