mayooear/langchain-supabase-website-chatbot

A chatbot kit that's already retired

A boilerplate for scraping your website, stuffing it into Supabase, and chatting with it via GPT—now archived and unmaintained.

Velocity (7d): +0.6 ★/day · Trend: steady

What it does

This is a Next.js starter that scrapes your website, converts the text into OpenAI embeddings via text-embedding-ada-002, stores them in a Supabase Postgres table using the pgvector extension, and exposes a chat interface where users can ask questions against that content. The scraping logic lives in a custom Cheerio-based loader where you manually map CSS selectors to extract the title, date, and content from each page.
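To make the selector mapping concrete, here is a minimal sketch of the idea, not the repository's actual loader. The `SelectorMap` shape, the `extractPage` helper, the example selectors, and the sample text are all hypothetical. The Cheerio lookup is replaced by an injected function so the snippet runs without an HTML parser installed.

```typescript
// Hypothetical config: which CSS selector yields each field on the target site.
interface SelectorMap {
  title: string;
  date: string;
  content: string;
}

interface ScrapedPage {
  title: string;
  date: string;
  content: string;
}

// Stand-in for a Cheerio lookup such as $(selector).text(). It is injected
// so this sketch has no dependencies; a real loader would query parsed HTML.
type TextQuery = (selector: string) => string;

function extractPage(query: TextQuery, selectors: SelectorMap): ScrapedPage {
  // Collapse runs of whitespace left over from HTML layout.
  const clean = (s: string): string => s.replace(/\s+/g, " ").trim();
  return {
    title: clean(query(selectors.title)),
    date: clean(query(selectors.date)),
    content: clean(query(selectors.content)),
  };
}

// Invented selectors and page text, for illustration only.
const exampleSelectors: SelectorMap = {
  title: "h1.entry-title",
  date: "time.published",
  content: ".entry-content",
};

const fakeDom = new Map<string, string>([
  ["h1.entry-title", "  Notion   Basics "],
  ["time.published", "2023-01-15"],
  [".entry-content", "Pages are made\n of blocks."],
]);

// A selector that matches nothing yields an empty string, as .text() would.
const fakeQuery: TextQuery = (selector) => fakeDom.get(selector) ?? "";

const page = extractPage(fakeQuery, exampleSelectors);
console.log(page);
```

Because every field depends on one hard-coded selector, renaming a class on the target site silently produces empty strings rather than an error.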

The interesting bit

The project doubles as a tutorial vehicle: there’s a YouTube walkthrough and a visual guide folder, which suggests it was built more for teaching LangChain patterns than for production use. The embedding pipeline is explicit: scrape, vectorize, store, then retrieve with match_documents, the similarity-search function called at query time.
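The retrieve step can be sketched in memory. This is an illustration under assumptions, not the project's code: it assumes retrieval ranks stored chunks by cosine similarity to the query embedding, and the names `StoredChunk`, `matchChunks`, and the three-dimensional toy vectors are hypothetical (real text-embedding-ada-002 vectors have 1536 dimensions and would come from the OpenAI API).

```typescript
interface StoredChunk {
  content: string;
  embedding: number[];
}

interface Match {
  content: string;
  similarity: number;
}

// Cosine similarity: dot product divided by the product of vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embedding dimensions differ");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // Treat a zero vector as matching nothing instead of dividing by zero.
  return denom === 0 ? 0 : dot / denom;
}

// Score every stored chunk against the query and keep the best matchCount.
function matchChunks(
  queryEmbedding: number[],
  store: StoredChunk[],
  matchCount: number
): Match[] {
  return store
    .map((chunk) => ({
      content: chunk.content,
      similarity: cosineSimilarity(queryEmbedding, chunk.embedding),
    }))
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, matchCount);
}

// Toy data: invented chunks with made-up 3-dimensional embeddings.
const store: StoredChunk[] = [
  { content: "How to create a Notion page", embedding: [0.9, 0.1, 0] },
  { content: "Pricing", embedding: [0, 1, 0] },
  { content: "Databases in Notion", embedding: [0.7, 0.7, 0] },
];

const topMatches = matchChunks([1, 0, 0], store, 2);
console.log(topMatches.map((m) => m.content));
```

In the real pipeline the scoring would run inside Postgres against the pgvector column, and the matched chunks would be passed to the chat model as context.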

Key highlights

  • Uses text-embedding-ada-002 and Supabase’s pgvector for retrieval
  • Scraping script (npm run scrape-embed) is decoupled from the app runtime
  • Requires manual CSS selector configuration per target site in custom_web_loader.ts
  • Frontend borrowed from an earlier community project (langchain-chat-nextjs)
  • Sample data comes from productivity blogger Thomas Frank’s Notion guides

Caveats

  • Explicitly unmaintained: the README banner warns against expecting issue or PR responses
  • No mention of rate limiting, caching, or auth—just the core plumbing
  • Scraping logic is brittle by design; one CSS change on your site breaks extraction

Verdict

Good for understanding how LangChain and vector retrieval fit together in a Next.js context, especially if you pair it with the video tutorial. Skip it if you need something maintained, or if your content lives anywhere other than static web pages you control.
