Open Assistant shipped, then shut down: what 37K stars left behind
A crowdsourced attempt to build a ChatGPT rival wrapped up in October 2023, leaving a dataset and a lot of infrastructure code.

What it does
Open Assistant was LAION’s bid to build an open, chat-based large language model that could handle tasks, call APIs, and fetch information dynamically. The project ran a public data-collection site where volunteers wrote prompts, ranked model responses, and labeled data to feed a classic RLHF pipeline: collect instruction-following samples, train a reward model on human rankings, then run reinforcement learning against it. The full stack—Next.js frontend, Python backend, Postgres, Docker Compose profiles for CI and inference—was designed to be runnable locally, though the README repeatedly warns that local setup is for development, not for running your own chatbot.
The interesting bit
The project treated crowdsourcing as a first-class engineering problem, not just a data-gathering step. They built leaderboards, anti-spam measures, and multi-vote ranking systems to handle “unreliable potentially malicious users.” That infrastructure for distributed human feedback may outlive the model itself.
Key highlights
- Followed the three-stage InstructGPT/RLHF recipe explicitly: supervised fine-tuning data → reward model → RL training
- Shipped a working chat frontend and data-collection web app at
open-assistant.io - Published the final
oasst2dataset on HuggingFace; this appears to be the main surviving artifact - Docker Compose setup with multiple profiles (
ci,inference) for local development - Devcontainer and GitHub Codespaces support for standardized contributor environments
Caveats
- The project is explicitly finished as of October 2023; no ongoing development
- The README’s “vision” section promises future capabilities (API usage, dynamic research, consumer hardware efficiency) that were aspirational and, given the shutdown, unfulfilled
- Local inference setup is documented but described as “only for development and is not meant to be used as a local chatbot, unless you know what you are doing”
Verdict
Worth studying if you’re building crowdsourced data pipelines or RLHF infrastructure, or if you want to mine the oasst2 dataset. Skip it if you’re looking for a maintained open assistant to deploy today—this is a finished experiment, not a living project.