A chatbot kit that's already retired
A boilerplate for scraping your website, stuffing it into Supabase, and chatting with it via GPT—now archived and unmaintained.

What it does
This is a Next.js starter that scrapes your website, converts the text into OpenAI embeddings via text-embedding-ada-002, stores them in a Supabase Postgres table using the pgvector extension, and exposes a chat interface where users can ask questions against that content. The scraping logic lives in a custom Cheerio-based loader where you manually map CSS selectors to extract the title, date, and content from each page.
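To make the selector-mapping idea concrete, here is a minimal sketch rather than the repository's actual loader. The names SelectorMap, SelectText, and extractPage, and the selectors themselves, are hypothetical. The HTML library is abstracted behind a function so the sketch runs without a parser; with Cheerio, that function would typically wrap a call such as $(selector).text().

```typescript
// Hypothetical shape: which CSS selector yields which field on the target site.
interface SelectorMap {
  title: string;
  date: string;
  content: string;
}

// Abstracts the HTML library: given a selector, return the matched text.
// Assumption: with Cheerio this would be something like (sel) => $(sel).text().
type SelectText = (selector: string) => string;

interface ExtractedPage {
  title: string;
  date: string;
  content: string;
}

// Pull the three fields out of a page and normalize their whitespace.
function extractPage(select: SelectText, selectors: SelectorMap): ExtractedPage {
  const clean = (s: string): string => s.replace(/\s+/g, " ").trim();
  return {
    title: clean(select(selectors.title)),
    date: clean(select(selectors.date)),
    content: clean(select(selectors.content)),
  };
}

// Stand-in for a parsed page (made-up selectors and text).
const fakePage: Record<string, string> = {
  "h1.post-title": "  Notion   Basics ",
  "time.published": "2023-01-15",
  "article .body": "Databases are\n  the core of Notion.",
};
const select: SelectText = (sel) => fakePage[sel] ?? "";

const page = extractPage(select, {
  title: "h1.post-title",
  date: "time.published",
  content: "article .body",
});
console.log(page);
```

A selector that matches nothing yields an empty string in this sketch, which is how a markup change on the target site would surface: the field comes back blank instead of raising an error.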
The interesting bit
The project doubles as a tutorial vehicle: there is a YouTube walkthrough and a visual guide folder, which suggests it was built more for teaching LangChain patterns than for production use. The embedding pipeline is explicit: scrape, vectorize, store, then retrieve with match_documents.
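The vectorize, store, and retrieve steps can be sketched in memory as follows. This is an illustration under loud assumptions, not the project's code: toyEmbed is a word-count stand-in for text-embedding-ada-002, the store array stands in for the Supabase table, and the local matchDocuments function only imitates the role that match_documents plays (ranking stored chunks by similarity to the query).

```typescript
interface StoredDoc {
  content: string;
  embedding: number[];
}

// Toy embedding (assumption): word counts over a tiny fixed vocabulary.
// A real pipeline would call an embedding model here instead.
const VOCAB = ["notion", "database", "template", "pricing", "team"];
function toyEmbed(text: string): number[] {
  const words = text.toLowerCase().split(/\W+/);
  return VOCAB.map((term) => words.filter((w) => w === term).length);
}

// Cosine similarity between two vectors; 0 if either vector is all zeros.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
  });
  b.forEach((y) => {
    normB += y * y;
  });
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// In-memory stand-in for the vector table.
const store: StoredDoc[] = [];

// "Vectorize" and "store": embed each scraped chunk and keep it.
function ingest(chunks: string[]): void {
  for (const content of chunks) {
    store.push({ content, embedding: toyEmbed(content) });
  }
}

// "Retrieve": return the k stored chunks most similar to the query.
function matchDocuments(query: string, k: number): string[] {
  const q = toyEmbed(query);
  return store
    .map((doc) => ({ content: doc.content, score: cosine(q, doc.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((d) => d.content);
}

ingest([
  "Notion database template for tracking projects",
  "Pricing tiers for team plans",
  "How to build a database in Notion",
]);
const top = matchDocuments("notion database", 2);
console.log(top);
```

In a chat flow, the retrieved chunks would then be passed to the language model as context for answering the user's question; that step is omitted here.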
Key highlights
- Uses text-embedding-ada-002 and Supabase’s pgvector for retrieval
- Scraping script (npm run scrape-embed) is decoupled from the app runtime
- Requires manual CSS selector configuration per target site in custom_web_loader.ts
- Frontend borrowed from an earlier community project (langchain-chat-nextjs)
- Sample data comes from productivity blogger Thomas Frank’s Notion guides
Caveats
- Explicitly unmaintained: the README banner warns against expecting issue or PR responses
- No mention of rate limiting, caching, or auth—just the core plumbing
- Scraping logic is brittle by design; one CSS change on your site breaks extraction
Verdict
Good for understanding how LangChain + vector retrieval fits together in a Next.js context, especially if you pair it with the video tutorial. Skip it if you need something maintained, or if your content lives anywhere other than static web pages you control.