snexus/llm-search

RAG with a YAML file and a healthy distrust of simple vector search

A local document search system that layers on every retrieval trick in the book—hybrid search, re-ranking, HyDE, multi-query—then exposes it all through an MCP server for your IDE.

658 stars · Jupyter Notebook · RAG · Search · Agents · Coding Assistants
Velocity (7d): +0.6 ★/day · Trend: steady

What it does

pyLLMSearch ingests your local documents (PDFs, markdown, Word files, and more) and turns them into a queryable knowledge base. You configure it with YAML, point it at a folder, and ask questions via a Streamlit UI, a FastAPI endpoint, or an MCP server that plugs into Cursor, Windsurf, or VS Code Copilot.

The interesting bit

The author clearly thinks basic RAG is a starting point, not a destination. The system stacks improvements like a paranoid researcher: SPLADE sparse embeddings mixed with dense vectors, cross-encoder re-ranking, HyDE for when you don’t know the jargon yet, and multi-query expansion inspired by RAG Fusion. Even the markdown parser gets special treatment—splitting on headings and cleaning image links instead of dumb chunking.
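
To make the multi-query idea concrete: RAG Fusion runs several rephrasings of one question and merges the ranked results with reciprocal rank fusion (RRF). The sketch below shows that merge step only. It is an illustration under assumptions, not code from the repository: the function name, the document IDs, and the constant k=60 (a common default for RRF) are all hypothetical, and the project may combine its results differently.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document earns 1 / (k + rank) for every list it appears in,
    so documents that rank well across many query variants rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties fall back to the ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical retrieval results for three rephrasings of one question.
rankings = [
    ["doc_a", "doc_b", "doc_c"],
    ["doc_b", "doc_a", "doc_d"],
    ["doc_b", "doc_c", "doc_a"],
]
fused = reciprocal_rank_fusion(rankings)
print(fused)
```

Here doc_b comes out first because it ranks at or near the top of all three lists, while doc_d, which only one variant retrieved, lands last.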

Key highlights

  • Incremental indexing: add documents without rebuilding the entire embedding store
  • Hybrid search: dense (ChromaDB) + sparse (SPLADE) with re-ranking via bge-reranker-v2-m3 or zerank-2
  • HyDE support with an honest warning that it “significantly alters” result quality
  • MCP server via SSE, so your IDE can query your document base directly
  • Table parsing via gmft or Azure Document Intelligence; optional image parsing via Gemini
  • Chat history with question contextualization
  • LiteLLM + Ollama compatibility for local model hosting
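
Incremental indexing generally comes down to knowing which files changed since the last run. One common way to do that is to store a content hash per file and compare on each pass, as sketched below. This is an assumption about the general technique, not a description of how pyLLMSearch tracks changes; `file_digest` and `plan_update` are hypothetical names.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def plan_update(folder: Path, known: dict[str, str]):
    """Compare files on disk against the hashes stored from the last run.

    Returns (to_embed, to_delete, new_state): files that are new or changed,
    files that disappeared, and the hash table to persist for next time.
    """
    current = {
        str(p.relative_to(folder)): file_digest(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }
    to_embed = [name for name, digest in current.items() if known.get(name) != digest]
    to_delete = [name for name in known if name not in current]
    return to_embed, to_delete, current


# Demo in a throwaway directory: first run, then edit, delete, and add a file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.md").write_text("alpha")
    (root / "b.md").write_text("beta")
    first_embed, first_delete, state = plan_update(root, {})

    (root / "a.md").write_text("alpha, revised")
    (root / "b.md").unlink()
    (root / "c.md").write_text("gamma")
    second_embed, second_delete, state = plan_update(root, state)

print(first_embed, second_embed, second_delete)
```

Only the files in `to_embed` need new embeddings, and the entries in `to_delete` are the ones whose vectors should be dropped from the store.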

Caveats

  • The README warns that HyDE can backfire if you don’t understand what you’re enabling
  • “Tested on up to few gigabytes” suggests you shouldn’t throw enterprise-scale archives at it blindly
  • Jupyter Notebook as the repo language hints that the core may be research-grade Python rather than a polished package

Verdict

Worth a look if you’re building a personal or team knowledge base and want retrieval improvements without writing the glue yourself. Skip if you need a managed, scalable enterprise solution—this is a configurable toolkit, not a service.
