alibaba/zvec

Alibaba's vector DB that runs inside your process

An in-process vector database with WAL durability, built for embedding directly into apps without server overhead.

9.8k stars · C++ · RAG · Search
Velocity (7d): +53 ★ / day · Trend: steady

What it does

Zvec is an in-process vector database in C++ that embeds directly into your application — no separate server, no network hop. It handles dense and sparse vectors, supports hybrid search with structured filters, and persists data via write-ahead logging so crashes don’t mean data loss. Python and Node.js bindings are available; Dart/Flutter joined the party in v0.4.0.

The interesting bit

The “in-process” part is the whole pitch. Alibaba battle-tested this internally, then shipped it as a library you pip install. The WAL durability layer stands out in this category, where many embedded vector stores treat persistence as an afterthought. Zvec also allows multi-process reads on the same collection, though writes stay exclusive to a single process.
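To make the write-ahead-logging idea concrete, here is a minimal sketch of the general technique in plain Python. It is not Zvec's implementation or API: the `ToyWal` class, its JSON-lines file format, and the `put` method are all hypothetical, chosen only to show the "log first, fsync, then apply" ordering and how replay recovers state after a crash.

```python
import json
import os


class ToyWal:
    """Toy write-ahead log (hypothetical, not Zvec's format or API)."""

    def __init__(self, path):
        self.path = path
        self.state = {}  # in-memory view: id -> vector
        self._replay()
        self._log = open(path, "ab")

    def _replay(self):
        # Rebuild state from the records that fully reached disk.
        if not os.path.exists(self.path):
            return
        good = 0  # byte offset just past the last complete record
        with open(self.path, "rb") as f:
            for raw in f:
                if not raw.endswith(b"\n"):
                    break  # torn final write: record was cut off mid-line
                try:
                    record = json.loads(raw)
                except ValueError:
                    break  # corrupt record: stop replaying here
                self.state[record["id"]] = record["vector"]
                good += len(raw)
        # Drop any incomplete tail so later appends start on a clean line.
        with open(self.path, "r+b") as f:
            f.truncate(good)

    def put(self, doc_id, vector):
        # Log first, force the bytes to disk, and only then mutate live state.
        line = json.dumps({"id": doc_id, "vector": vector}) + "\n"
        self._log.write(line.encode("utf-8"))
        self._log.flush()
        os.fsync(self._log.fileno())
        self.state[doc_id] = vector

    def close(self):
        self._log.close()
```

Reopening a `ToyWal` on the same path replays the log, so every `put` that returned before a crash is visible again. A production engine would add checksums, log compaction, and batching on top of this ordering.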

Key highlights

  • Searches billions of vectors in milliseconds (per README claims; benchmark methodology is documented separately)
  • Dense + sparse vectors with multi-vector queries in a single call
  • Hybrid search: combine semantic similarity with structured metadata filters
  • WAL-backed persistence survives process crashes and power failures
  • Cross-platform: Linux (x86_64, ARM64), macOS (ARM64), Windows (x86_64), plus iOS and Android via Flutter FFI
  • v0.4.0 (May 2026) added Dart/Flutter SDK, iOS builds, and fixed SQ8 quantizer recall issues
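
The hybrid search highlight can be illustrated with a small brute-force sketch: filter on metadata first, then rank the survivors by cosine similarity. This is an assumption-laden toy rather than Zvec's API; `hybrid_search`, the `docs` layout, and the `predicate` callback are hypothetical names. A real engine would typically use an approximate nearest-neighbor index instead of scanning every vector.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(docs, query_vec, predicate, top_k=3):
    """Return ids of the top_k docs that pass the metadata filter, best match first."""
    candidates = [d for d in docs if predicate(d["meta"])]
    ranked = sorted(
        candidates,
        key=lambda d: cosine(d["vector"], query_vec),
        reverse=True,
    )
    return [d["id"] for d in ranked[:top_k]]


# Hypothetical sample data: 2-D vectors plus structured metadata.
docs = [
    {"id": "a", "vector": [1.0, 0.0], "meta": {"lang": "en", "year": 2024}},
    {"id": "b", "vector": [0.9, 0.1], "meta": {"lang": "de", "year": 2024}},
    {"id": "c", "vector": [0.0, 1.0], "meta": {"lang": "en", "year": 2023}},
    {"id": "d", "vector": [0.7, 0.7], "meta": {"lang": "en", "year": 2025}},
]

# Semantic query restricted to English documents.
hits = hybrid_search(docs, [1.0, 0.0], lambda m: m["lang"] == "en", top_k=2)
print(hits)  # ['a', 'd']
```

Document "b" is the second-closest vector overall but never reaches the ranking step, because the structured filter removes it before similarity is computed.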

Caveats

  • Write access is single-process exclusive; you’ll need external coordination for concurrent writers
  • The “billions of vectors in milliseconds” claim lacks specific hardware context in the README
  • Windows path handling and sparse vector index ordering were recent bug fixes — worth monitoring
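
One way to provide the external coordination mentioned in the first caveat is an advisory lock file that each would-be writer must create before opening the collection for writing. The sketch below uses only the Python standard library. `single_writer`, `WriterBusy`, and the lock path are hypothetical names, and nothing here is part of Zvec itself.

```python
import contextlib
import os


class WriterBusy(RuntimeError):
    """Raised when another process already holds the writer lock."""


@contextlib.contextmanager
def single_writer(lock_path):
    """Hold an exclusive advisory lock for the duration of the with-block."""
    try:
        # O_CREAT | O_EXCL fails if the file already exists, so exactly one
        # process can create it.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise WriterBusy(f"writer lock already held: {lock_path}") from None
    try:
        # Record the owner's PID to help diagnose stale locks by hand.
        os.write(fd, str(os.getpid()).encode("ascii"))
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)
```

A writer would wrap its write session in `with single_writer("collection.lock"):` and back off or retry on `WriterBusy`. The lock is advisory and does not clean itself up: if the holder is killed before the `finally` block runs, the file remains and has to be removed manually or by a staleness check.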

Verdict

Worth a look if you’re building RAG pipelines, local LLM memory, or edge AI and want vector search without operational baggage. Skip it if you need distributed writes or already run a managed vector service you’re happy with.
