The "hello world" of RAG apps, fully documented
A tutorial repo that wires Streamlit, LangChain, and OpenAI into the simplest possible PDF Q&A tool.

What it does
Upload a PDF, type a question, get an answer drawn from the document. The app chunks the text, embeds it with OpenAI, retrieves relevant chunks via semantic search, and feeds them to an LLM. Classic RAG, no surprises.
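The chunk, embed, and retrieve steps can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the repo's code: the function names are hypothetical, and a toy bag-of-words vector stands in for OpenAI embeddings so the example runs offline. That makes the matching lexical rather than truly semantic, but the ranking-by-cosine-similarity shape is the same.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows (hypothetical helper)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    windows = (text[i:i + chunk_size] for i in range(0, len(text), step))
    return [w for w in windows if w.strip()]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts.

    ASSUMPTION: a real app would call an embedding API here instead.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q_vec = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q_vec, embed(c)), reverse=True)
    return ranked[:k]
```

Calling `retrieve(question, chunk_text(pdf_text))` yields the chunks that would then be passed to the LLM as context.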
The interesting bit
The README is admirably honest: this is “for educational purposes only” and exists to support a YouTube tutorial. That transparency is rarer than it should be. The code is essentially glue (Streamlit for the GUI, LangChain for orchestration, OpenAI for both embeddings and generation), but the glue is clean and the explanation is thorough.
Key highlights
- Uses OpenAI embeddings + LLM for retrieval-augmented generation
- Streamlit one-liner GUI (streamlit run app.py)
- Explicitly scoped as tutorial material, not a maintained product
- Requires only pip install -r requirements.txt and an OpenAI API key in .env
- LLM is constrained to document-only answers (no hallucinating about your grocery list)
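One common way to get document-only answers is to say so in the prompt. The sketch below shows that pattern; the function name and the instruction wording are assumptions for illustration, not taken from the repo, and a prompt instruction like this reduces off-document answers rather than guaranteeing none.

```python
def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Assemble a prompt that restricts the model to the retrieved context.

    ASSUMPTION: the wording is illustrative; the tutorial may phrase it differently.
    """
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )
```

The returned string is what would be sent to the LLM along with the user's question.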
Caveats
- No screenshot available to preview the UI
- Hard dependency on OpenAI (no local model option mentioned)
- Not accepting contributions; don’t expect bug fixes or feature expansion
Verdict
Grab this if you’re learning RAG and want a minimal, working reference to dissect. Skip it if you need production-grade PDF parsing, multi-model support, or an actively maintained codebase.