A Redis tutorial wearing a lab coat
A deliberately simple RAG demo that fetches arXiv papers, embeds them, and answers questions via Redis vector search.

What it does
ArXiv ChatGuru is a Streamlit app that turns an arXiv topic into a searchable knowledge base. You pick a subject and paper count; it fetches from arXiv, chunks the PDFs, generates OpenAI embeddings, and stores everything in a Redis vector index. Questions get answered by retrieving the closest chunks and feeding them to a chat model via LangChain.
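The chunk, embed, and retrieve steps described above can be sketched in a few lines. This is a minimal illustration only, not the project's code: `chunk_text`, `toy_embed`, `cosine`, and `top_k` are hypothetical names, a bag-of-words counter stands in for OpenAI embeddings, and a plain Python list stands in for the Redis vector index.

```python
import math
from collections import Counter


def chunk_text(text, size=50, overlap=10):
    """Split text into overlapping windows of `size` words."""
    words = text.split()
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def toy_embed(text):
    """Assumption: a word-count vector standing in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, chunks, k=2):
    """Return the k chunks most similar to the query (in-memory stand-in for vector search)."""
    q = toy_embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, toy_embed(c)), reverse=True)
    return ranked[:k]
```

In a real pipeline the retrieved chunks would then be placed into the chat model's prompt as context; here `top_k` simply returns them.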
The interesting bit
The README is admirably honest: this is “intentionally simple” and “a learning project,” not a production research assistant. That clarity is refreshing. The stats page that exposes Redis index metadata and query engine stats is a nice touch for understanding what’s actually happening under the hood.
Key highlights
- Topic-scoped Redis vector indexes keep different paper collections isolated
- Docker-first local setup with make docker-up; local Python 3.13 + Poetry path also available
- Built-in stats page to inspect index metadata, fields, and query engine performance
- Uses GPT-4.1-mini and text-embedding-3-small by default (configurable via .env)
- Clean Makefile with format, test, build, dev, and docker commands
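One way topic-scoped indexes can be kept isolated is to derive the index name and key prefix from the topic, since a Redis search index is defined over key prefixes. The sketch below is an assumption about how that might look: the `idx:` and `doc:` naming scheme and the helper names `topic_slug` and `index_names` are hypothetical and are not taken from the project.

```python
import re


def topic_slug(topic):
    """Lowercase the topic and collapse non-alphanumeric runs into single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")


def index_names(topic):
    """Hypothetical naming scheme: one index and one key prefix per topic."""
    slug = topic_slug(topic)
    return {"index": f"idx:{slug}", "prefix": f"doc:{slug}:"}
```

Under this scheme, chunks for "Quantum Computing" and "Graph Neural Networks" would live under different key prefixes, so a search against one index would not return documents from the other.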
Caveats
- Requires OpenAI API key; no local model fallback mentioned
- “Planned follow-ups” include basic features like year/author filters and chat history, suggesting the current version is pretty bare-bones
- No mention of rate limiting or cost controls for arXiv fetching or OpenAI calls
Verdict
Worth an hour if you’re learning RAG architecture and want to see Redis as a vector database in a complete, runnable pipeline. Skip it if you need a serious research tool or want to avoid OpenAI dependencies.