chroma-core/chroma

Vector search that fits in a Python notebook

Chroma hides the embedding pipeline so you can prototype RAG without touching a model API first.

28.3k stars · Rust · RAG · Search · Data Tooling
Velocity (7d): +21 ★/day · Trend: steady

What it does

Chroma is an open-source vector database aimed at AI applications. You feed it text documents, and it handles tokenization, embedding, and indexing automatically. Query with natural language and get the closest matches back, optionally filtered by metadata or document content. It runs in-memory for quick experiments or persists to disk, and there is a client-server mode if you outgrow the local process.

The interesting bit

The API surface is deliberately tiny: create a collection, add documents, query. That is the whole pitch. The project bets that most developers building retrieval-augmented generation do not want to orchestrate embedding models and vector indices before they know whether the prototype will survive the week. Chroma abstracts that away, though you can bring your own embeddings if you prefer.

Key highlights

  • Core API is four functions: Client(), create_collection, add, query.
  • Auto-embeds documents or accepts pre-computed vectors.
  • Metadata and full-text filters on top of similarity search.
  • Python and JavaScript clients; server mode via chroma run.
  • Hosted cloud option (Chroma Cloud) with serverless vector, hybrid, and full-text search.
  • Apache 2.0, ~28k stars, releases on Mondays.
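
To make "filters on top of similarity search" concrete, here is a toy brute-force version in plain Python. It is not Chroma's implementation: a real vector database typically uses an approximate nearest-neighbor index rather than scanning every record. The `search` function, the record layout, and the sample data are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(records, query_vec, k=2, where=None, contains=None):
    """Filter first, then rank the survivors by similarity to query_vec.

    where:    dict of metadata key/value pairs that must all match exactly.
    contains: substring that must appear in the document text.
    """
    candidates = [
        r for r in records
        if (where is None
            or all(r["meta"].get(key) == val for key, val in where.items()))
        and (contains is None or contains in r["text"])
    ]
    ranked = sorted(
        candidates,
        key=lambda r: cosine(r["vector"], query_vec),
        reverse=True,
    )
    return [r["id"] for r in ranked[:k]]


# Hypothetical records with hand-written two-dimensional "embeddings".
records = [
    {"id": "a", "vector": [1.0, 0.0], "text": "intro to vector search",
     "meta": {"lang": "en"}},
    {"id": "b", "vector": [0.8, 0.6], "text": "vector index tuning",
     "meta": {"lang": "en"}},
    {"id": "c", "vector": [0.99, 0.1], "text": "recherche vectorielle",
     "meta": {"lang": "fr"}},
]

query = [1.0, 0.0]
print(search(records, query))                         # similarity only
print(search(records, query, where={"lang": "en"}))   # metadata filter
print(search(records, query, contains="index"))       # full-text filter
```

The point of the sketch is the order of operations: the filters shrink the candidate set, and similarity only ranks what is left.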

Caveats

  • The README notes a “row-based API coming soon,” so the current interface may shift.
  • “Extremely fast” and “painless” are claimed for the cloud tier, but no numbers or benchmarks are provided.

Verdict

Good for developers who want to test RAG or semantic search without wiring up transformers and ANN libraries first. Less compelling if you already run a dedicated embedding pipeline and need fine-grained control over indexing or model choice.
