jina-ai/vectordb

Jina's vector database: Pythonic glue with a deployment story

A thin Python wrapper around DocArray and Jina that promises CRUD, sharding, and cloud deployment without the bloat.

649 stars · Python · RAG · Search
Velocity (7d): +0.6 ★/day · Trend: steady

What it does

vectordb is a Python-native vector database built on two existing Jina AI projects: DocArray handles the vector search, and Jina handles serving and scaling. You define schemas with DocArray's dataclass syntax, pick an index (exact nearest-neighbor or HNSW), then run it in-process, serve it over gRPC, HTTP, or WebSocket, or deploy it to Jina AI Cloud with the vectordb deploy command. The jc CLI is used afterwards to manage deployed instances.
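To make the schema, index, and search workflow concrete, here is a minimal standard-library sketch of a brute-force (exact) index with the four CRUD operations. It is a toy stand-in, not vectordb's implementation or API: ToyDoc, ToyExactIndex, and cosine_similarity are hypothetical names introduced for illustration, and a plain Python dataclass stands in for a DocArray schema.

```python
from dataclasses import dataclass
import math


@dataclass
class ToyDoc:
    """Hypothetical schema: an id, some text, and an embedding vector."""
    id: str
    text: str
    embedding: list


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyExactIndex:
    """Toy in-memory index; compares the query against every stored document."""

    def __init__(self):
        self._docs = {}

    def index(self, docs):
        # Insert documents, keyed by id.
        for doc in docs:
            self._docs[doc.id] = doc

    def update(self, docs):
        # Replace only documents whose id is already stored.
        for doc in docs:
            if doc.id in self._docs:
                self._docs[doc.id] = doc

    def delete(self, ids):
        # Remove documents by id; unknown ids are ignored.
        for doc_id in ids:
            self._docs.pop(doc_id, None)

    def search(self, query, limit=10):
        # Exact search: score every document, then keep the best `limit`.
        scored = [(cosine_similarity(query, d.embedding), d) for d in self._docs.values()]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [doc for _, doc in scored[:limit]]


db = ToyExactIndex()
db.index([
    ToyDoc("a", "pointing east", [1.0, 0.0]),
    ToyDoc("b", "pointing north", [0.0, 1.0]),
    ToyDoc("c", "pointing north-east", [1.0, 1.0]),
])
hits = db.search([1.0, 0.1], limit=2)
print([hit.id for hit in hits])  # ['a', 'c']
```

Exact search scans every document on every query, which is simple and always correct but scales linearly with collection size. That cost is the usual reason to move to an approximate index such as HNSW.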

The interesting bit

The pitch is “no more, no less” — and the README mostly delivers on that modesty. The unusual angle is the tight coupling with Jina’s ecosystem: RAFT-based replication for multi-replica setups, and a one-command cloud deploy (vectordb deploy --db example:db) that feels closer to a PaaS workflow than typical vector DB tooling.

Key highlights

  • CRUD operations (index, search, update, delete) share the same API across local and client-server modes
  • Two backend options: brute-force InMemoryExactNNVectorDB or approximate HNSWVectorDB (via hnswlib)
  • Serve locally with db.serve(protocol='grpc', ...) or deploy to Jina AI Cloud with vectordb deploy; the jc CLI then manages deployed instances
  • Sharding for latency; RAFT-based replication for throughput and availability
  • Cloud replication is currently pinned to 1 replica — the README notes this is “being worked on”
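The sharding point can be illustrated with a small standard-library sketch: documents are split across shards, each shard runs an exact top-k search on its own slice, and the partial results are merged into a global top-k. This is an assumption-laden illustration rather than how vectordb or Jina route requests; shard_for, search_shard, and sharded_search are hypothetical helpers, and the byte-sum routing rule is chosen only to be deterministic.

```python
import heapq


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def shard_for(doc_id, num_shards):
    # Hypothetical routing rule: deterministic, based on the id's bytes.
    return sum(doc_id.encode()) % num_shards


def search_shard(shard, query, limit):
    """Exact top-k inside one shard, returned as (distance, doc_id) pairs."""
    candidates = ((squared_distance(query, vec), doc_id) for doc_id, vec in shard.items())
    return heapq.nsmallest(limit, candidates)


def sharded_search(shards, query, limit):
    """Query every shard, then merge the per-shard results into a global top-k."""
    partial = [search_shard(shard, query, limit) for shard in shards]
    return heapq.nsmallest(limit, (hit for hits in partial for hit in hits))


vectors = {
    "a": [0.0, 0.0],
    "b": [1.0, 0.0],
    "c": [0.0, 2.0],
    "d": [3.0, 3.0],
    "e": [0.5, 0.5],
}

num_shards = 2
shards = [{} for _ in range(num_shards)]
for doc_id, vec in vectors.items():
    shards[shard_for(doc_id, num_shards)][doc_id] = vec

results = sharded_search(shards, [0.0, 0.0], limit=3)
print([doc_id for _, doc_id in results])  # ['a', 'e', 'b']
```

The sketch queries shards one after another. In a real deployment each shard would answer in parallel, so per-query work shrinks roughly with the number of shards, which is where the latency benefit comes from. Because every shard returns its own top `limit` hits, the merged result matches an unsharded exact search.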

Caveats

  • Cloud deployments don’t yet support the replication feature that exists in local/self-hosted mode
  • The HNSW configuration docs are truncated mid-sentence in the README; ef_construction and other parameters are cut off

Verdict

Worth a look if you’re already in the Jina/DocArray ecosystem and want a unified path from laptop prototype to hosted service. If you’re committed to Milvus, Weaviate, or pgvector, this is probably too ecosystem-specific to displace your setup.
