HKUDS/LightRAG

RAG with a graph brain, minus the graph PhD

LightRAG builds a knowledge graph from your documents but keeps retrieval fast by mixing low-level text chunks with high-level graph relationships.

36.3k stars · Python · RAG · Search · LLMOps · Eval
Velocity (7d): +59 ★/day · Trend: steady

What it does

LightRAG ingests documents, extracts entities and relationships into a knowledge graph, then answers queries by retrieving both specific text chunks and broader graph neighborhoods. It ships as a Python library with an optional server, WebUI, and Docker Compose setup. The project has accumulated 36K stars and a small constellation of related repos (MiniRAG, VideoRAG, RAG-Anything).

The interesting bit

Most graph RAG systems treat the knowledge graph as the main event; LightRAG treats it as a cheap index. The “dual-level retrieval” fetches exact matches and related entities/edges, so you get precision without losing context. The README claims this stays fast at scale, though the actual speed numbers live in the paper, not the repo.
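
To make the dual-level idea concrete, here is a toy sketch in plain Python. It is not LightRAG's code or API: `ToyIndex`, its methods, and the sample data are all hypothetical, and real systems use embeddings and LLM-extracted entities rather than hand-labeled ones. It only shows the shape of the lookup: direct hits first, then chunks reached through one hop in the graph.

```python
from dataclasses import dataclass, field


@dataclass
class ToyIndex:
    """Hypothetical in-memory index: text chunks plus a small entity graph."""
    chunks: dict = field(default_factory=dict)    # chunk_id -> text
    mentions: dict = field(default_factory=dict)  # entity -> set of chunk_ids
    edges: dict = field(default_factory=dict)     # entity -> set of neighbor entities

    def add_chunk(self, chunk_id, text, entities):
        self.chunks[chunk_id] = text
        for entity in entities:
            self.mentions.setdefault(entity, set()).add(chunk_id)

    def add_edge(self, a, b):
        # Undirected relationship between two entities.
        self.edges.setdefault(a, set()).add(b)
        self.edges.setdefault(b, set()).add(a)

    def retrieve(self, query_entities):
        # Low level: chunks that mention the query entities directly.
        low = set()
        for entity in query_entities:
            low |= self.mentions.get(entity, set())

        # High level: chunks that mention one-hop graph neighbors.
        neighbors = set()
        for entity in query_entities:
            neighbors |= self.edges.get(entity, set())
        neighbors -= set(query_entities)

        high = set()
        for entity in neighbors:
            high |= self.mentions.get(entity, set())
        high -= low  # do not report a chunk twice

        return sorted(low), sorted(high)


index = ToyIndex()
index.add_chunk("c1", "LightRAG builds a knowledge graph.", ["LightRAG"])
index.add_chunk("c2", "Neo4j is a graph database.", ["Neo4j"])
index.add_chunk("c3", "Bun is a JavaScript runtime.", ["Bun"])
index.add_edge("LightRAG", "Neo4j")

direct, related = index.retrieve(["LightRAG"])
print(direct, related)
```

The query for "LightRAG" pulls `c1` as a direct match and `c2` as graph context, while the unrelated `c3` stays out. That is the "precision without losing context" trade in miniature.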

Key highlights

  • Pluggable storage backends: PostgreSQL, MongoDB, Neo4j, OpenSearch, or plain JSON/vector files
  • Role-specific LLM configs: different models for extraction, querying, keyword generation, and vision tasks
  • Multimodal ingestion via merged RAG-Anything support (PDFs, images, tables, equations through MinerU/Docling)
  • Ollama-compatible API surface, plus a web dashboard for indexing and graph visualization
  • Setup wizard and signed Docker images for air-gapped or offline deployment
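
The first two highlights describe a common pattern: storage behind a narrow interface, and a per-role model lookup. The sketch below illustrates that pattern only. It does not use LightRAG's actual classes or settings: `KVStore`, `MemoryStore`, `JsonFileStore`, `ROLE_MODELS`, and the model names are hypothetical placeholders.

```python
import json
from pathlib import Path
from typing import Optional, Protocol


class KVStore(Protocol):
    """Hypothetical minimal interface any storage backend would satisfy."""
    def put(self, key: str, value: dict) -> None: ...
    def get(self, key: str) -> Optional[dict]: ...


class MemoryStore:
    """Backend that keeps everything in a dict (useful for tests)."""
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class JsonFileStore:
    """Backend that persists to a single JSON file on disk."""
    def __init__(self, path):
        self._path = Path(path)

    def _load(self):
        if self._path.exists():
            return json.loads(self._path.read_text())
        return {}

    def put(self, key, value):
        data = self._load()
        data[key] = value
        self._path.write_text(json.dumps(data))

    def get(self, key):
        return self._load().get(key)


# Placeholder model names, one per role; not real model identifiers.
ROLE_MODELS = {
    "extraction": "small-model",
    "query": "large-model",
    "keywords": "small-model",
    "vision": "vision-model",
}


def model_for(role, default="large-model"):
    """Pick the model configured for a role, falling back to a default."""
    return ROLE_MODELS.get(role, default)


def save_entity(store: KVStore, name: str, description: str) -> None:
    """Works with any backend that implements put/get."""
    store.put(name, {"description": description, "model": model_for("extraction")})


store = MemoryStore()
save_entity(store, "LightRAG", "graph-aware RAG library")
print(store.get("LightRAG"))
```

Swapping `MemoryStore()` for `JsonFileStore("entities.json")` changes where data lives without touching `save_entity`, which is the practical payoff of a pluggable backend.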

Caveats

  • The README is enthusiastic about features but vague on concrete benchmarks; “simple and fast” is the promise, not the proof
  • A feature list dated 2026.05 alongside 36K stars suggests rapid, possibly churn-heavy development
  • Front-end build requires bun and a separate Node toolchain; not a pure Python install

Verdict

Worth a look if you want graph-aware RAG without committing to Neo4j Cypher or massive infrastructure. Skip it if you need battle-tested simplicity: this is a research-backed project still accumulating features faster than it documents tradeoffs.
