pathwaycom/llm-app

RAG templates that actually stay current when your docs change

A collection of Docker-ready LLM app templates built on a Rust-backed streaming engine, designed to keep indexes in sync with live data sources.

59.4k stars · Jupyter Notebook · RAG · Search, LLMOps · Eval, App Builders
Velocity (7d): +56 ★ / day · Trend: steady

What it does

This repo ships ready-to-run templates for RAG and enterprise search pipelines that re-index automatically as your data changes. Connectors watch Google Drive, SharePoint, S3, Kafka, PostgreSQL, local files, and real-time APIs; new documents, edits, and deletions propagate through without manual refreshes. Each template exposes an HTTP API and optionally a Streamlit UI, packaged for Docker deployment anywhere.
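
To make "new documents, edits, and deletions propagate through" concrete, here is a minimal sketch of an index that applies change events as they arrive instead of being rebuilt from scratch. It is a toy in plain Python, not Pathway's API: the `LiveIndex` class, the `upsert`/`delete` event names, and the whitespace tokenizer are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LiveIndex:
    """Toy in-memory keyword index kept in sync by change events (hypothetical)."""

    docs: dict = field(default_factory=dict)      # doc_id -> current text
    postings: dict = field(default_factory=dict)  # token -> set of doc_ids

    def _unindex(self, doc_id: str) -> None:
        """Remove every posting that belongs to the stored version of doc_id."""
        old_text = self.docs.pop(doc_id, None)
        if old_text is None:
            return
        for token in set(old_text.lower().split()):
            ids = self.postings.get(token)
            if ids is not None:
                ids.discard(doc_id)
                if not ids:
                    del self.postings[token]

    def apply(self, event: str, doc_id: str, text: str = "") -> None:
        """Apply one change event: 'upsert' (new or edited doc) or 'delete'."""
        if event not in ("upsert", "delete"):
            raise ValueError(f"unknown event: {event}")
        # Edits and deletes both start by dropping the stale entries.
        self._unindex(doc_id)
        if event == "upsert":
            self.docs[doc_id] = text
            for token in set(text.lower().split()):
                self.postings.setdefault(token, set()).add(doc_id)

    def search(self, query: str) -> list:
        """Return sorted ids of documents containing every query token."""
        tokens = query.lower().split()
        if not tokens:
            return []
        hits = set.intersection(*(self.postings.get(t, set()) for t in tokens))
        return sorted(hits)


idx = LiveIndex()
idx.apply("upsert", "a", "quarterly revenue report")
idx.apply("upsert", "b", "revenue forecast")
idx.apply("upsert", "a", "quarterly hiring plan")  # an edit replaces the old text
idx.apply("delete", "b")
print(idx.search("revenue"))  # no document mentions revenue any more
```

The point of the design is that each event touches only the affected document's entries, which is what makes continuous syncing cheaper than periodic full re-indexing.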

The interesting bit

The templates sit on top of the Pathway framework, a Python library with a Rust engine that handles both streaming data sync and request serving. That lets them replace the usual Frankenstein stack of vector DB, cache, and API framework with built-in in-memory indexes (vector via usearch, full-text via Tantivy) that update live. One README claims that switching from a vector index to hybrid search is a one-line change; if true, that is the kind of simplification that usually comes at the price of vendor lock-in, but here the code is open source.
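
For readers unfamiliar with hybrid search, the sketch below shows one common way to merge a vector ranking with a full-text ranking: reciprocal rank fusion (RRF). Whether the templates fuse results this way is an assumption; the function name, the document ids, and the constant `k = 60` are illustrative and not taken from the repo.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of doc ids into one list.

    Each document scores 1 / (k + rank) per list it appears in, with ranks
    starting at 1. Ties are broken by doc id so the output is deterministic.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from a vector index and a full-text index.
vector_hits = ["d1", "d2", "d3"]
keyword_hits = ["d3", "d1", "d4"]
print(reciprocal_rank_fusion([vector_hits, keyword_hits]))
# → ['d1', 'd3', 'd2', 'd4']
```

Documents that rank well in both lists rise to the top, and the fusion step needs only ranks, not comparable scores, which is why it is a popular default for combining dense and keyword retrieval.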

Key highlights

  • Seven distinct templates, from basic Q&A RAG to multimodal GPT-4o parsing, private local stacks with Ollama/Mistral, and an adaptive RAG mode that claims up to 4× token cost reduction
  • Built-in connectors for enterprise sources (SharePoint, Google Drive, S3, Kafka, PostgreSQL) with no separate infrastructure to provision
  • Scales to “millions of pages” per the README; indexing is in-memory with cache
  • Optional Streamlit UIs for quick demos; otherwise just HTTP APIs to wire into your own frontend
  • Integrates as a retriever backend for LangChain or LlamaIndex if you already have a frontend
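
The adaptive RAG highlight is easier to evaluate with the general idea in view. One common approach, sketched below, is to send the model a small number of retrieved documents first and widen the context only when it cannot answer. This summary does not say how the template implements it, so the loop, the parameter names, and the `retrieve` and `ask_llm` callables are all assumptions.

```python
from typing import Callable, List, Optional, Tuple


def adaptive_answer(
    question: str,
    retrieve: Callable[[str, int], List[str]],
    ask_llm: Callable[[str, List[str]], Optional[str]],
    start_docs: int = 2,
    factor: int = 2,
    max_docs: int = 16,
) -> Tuple[Optional[str], int]:
    """Ask with a small context first and grow it geometrically on failure.

    ask_llm must return None when the context is insufficient.
    Returns (answer, total number of documents sent across all rounds).
    """
    n = start_docs
    docs_sent = 0
    while n <= max_docs:
        context = retrieve(question, n)
        docs_sent += len(context)
        answer = ask_llm(question, context)
        if answer is not None:
            return answer, docs_sent
        n *= factor  # widen the context and try again
    return None, docs_sent


# Stand-ins for a real retriever and model (hypothetical).
corpus = [f"d{i}" for i in range(10)]

def fake_retrieve(question, n):
    return corpus[:n]

def fake_llm(question, context):
    return "found" if "d2" in context else None

print(adaptive_answer("where is d2?", fake_retrieve, fake_llm))
# → ('found', 6)
```

Here the answer arrives after rounds of 2 and 4 documents, 6 in total, where a fixed 16-document prompt would have sent 16. The saving depends entirely on how often the first round suffices, which is why a headline multiplier needs checking on your own queries.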

Caveats

  • The “4× token cost reduction” and “one-line change” claims are stated but not benchmarked in the README; you’ll need to verify on your data
  • All indexing is in-memory, so RAM limits apply at scale; the sources do not mention spill-to-disk or a distributed mode
  • The repo is templates and examples, not a single turnkey product; expect to read individual template READMEs and adapt
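
To get a feel for the in-memory caveat, a back-of-envelope estimate of vector storage helps. The sketch below assumes float32 embeddings and an overhead multiplier for index structures; the 1536 dimension, the one-million-chunk corpus, and the 1.5 overhead factor are illustrative assumptions, not figures from the repo.

```python
def vector_index_ram_gib(num_chunks, dim, bytes_per_value=4, overhead=1.5):
    """Rough RAM estimate in GiB for storing num_chunks embeddings in memory.

    bytes_per_value=4 assumes float32. overhead is a guessed multiplier for
    index structures and bookkeeping; measure it rather than trusting it.
    """
    raw_bytes = num_chunks * dim * bytes_per_value
    return raw_bytes * overhead / 2**30


# One million chunks of 1536-dimensional float32 vectors.
print(round(vector_index_ram_gib(1_000_000, 1536, overhead=1.0), 2))  # → 5.72 (raw vectors only)
print(round(vector_index_ram_gib(1_000_000, 1536), 2))                # → 8.58 (with assumed overhead)
```

The estimate covers vectors only. A full-text index, cached document text, and the application itself add to it, so treat the result as a floor when sizing a machine.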

Verdict

Worth a look if you’re tired of writing glue code to keep vector stores in sync with living document sources, especially in enterprise environments. Skip it if you need a managed hosted service or if your workloads already run smoothly on a static-batch RAG pipeline you don’t mind re-running manually.
