The vector database client that works offline
A Python SDK for Qdrant that lets you prototype with zero infrastructure, then connect to a real server when you're ready.

What it does
qdrant-client is the official Python SDK for the Qdrant vector search engine. It wraps the full REST and gRPC APIs with type hints, adds helper methods for common tasks like bulk uploads, and provides both sync and async clients. You can connect to a self-hosted server or Qdrant Cloud, or run entirely in local mode with no server at all.
The interesting bit
The local mode is the quiet killer feature. Initialize with QdrantClient(":memory:") or a local file path and you get the same API surface without running a server, which is useful for CI pipelines, Jupyter notebooks, and early prototyping. When you outgrow it, swap the constructor arguments and point at a real cluster. The client also offers an optional inference layer via FastEmbed, so you can pass raw text documents instead of pre-computed vectors if you install the extra.
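To make the "swap the constructor arguments" step concrete, here is a minimal sketch of one way to choose between local mode and a remote server at startup. The environment variable names QDRANT_URL and QDRANT_API_KEY and both helper functions are assumptions made for this illustration, not part of the SDK; the only library call used is the QdrantClient constructor with its location, url, and api_key arguments.

```python
import os


def client_kwargs(env=None):
    """Choose QdrantClient constructor arguments.

    QDRANT_URL and QDRANT_API_KEY are hypothetical variable names
    picked for this sketch; use whatever your deployment defines.
    """
    env = os.environ if env is None else env
    url = env.get("QDRANT_URL")
    if url:
        # A server is configured: connect to it over the network.
        kwargs = {"url": url}
        api_key = env.get("QDRANT_API_KEY")
        if api_key:
            kwargs["api_key"] = api_key
        return kwargs
    # Nothing configured: fall back to local in-memory mode.
    return {"location": ":memory:"}


def make_client(env=None):
    """Build a client; the rest of the application code stays the same."""
    # Imported lazily so client_kwargs() can be tested without the SDK installed.
    from qdrant_client import QdrantClient

    return QdrantClient(**client_kwargs(env))
```

With no variables set, make_client() returns an in-memory client suitable for tests and notebooks. Setting QDRANT_URL switches the same code path to a real server without touching the calling code.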
Key highlights
- Local mode: in-memory or on-disk, no Docker required
- Async support across all methods since v1.6.1
- gRPC option for faster bulk uploads
- Optional FastEmbed integration for CPU/GPU embedding generation
- Full type coverage for the Qdrant API
Caveats
- The fastembed and fastembed-gpu extras are mutually exclusive; switching requires a fresh environment
- Remote inference via Qdrant Cloud is only available on paid plans
- The README warns against one-by-one point uploads due to request overhead
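
The last caveat is easy to act on. As a rough sketch, points can be grouped into fixed-size batches so that each request carries many of them instead of one. The helpers batched and upload_in_batches and the default batch size of 256 are hypothetical choices for this illustration; the only client call assumed is upsert with collection_name and points arguments. The client's own bulk-upload helpers may be the better choice in practice.

```python
from typing import Any, Iterable, Iterator, List


def batched(items: Iterable[Any], size: int) -> Iterator[List[Any]]:
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[Any] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch


def upload_in_batches(client, collection_name: str, points: Iterable[Any],
                      batch_size: int = 256) -> int:
    """Send points in batches and return the number of requests made.

    `client` is expected to be a QdrantClient and `points` an iterable of
    point structures; the batch size is an arbitrary starting value.
    """
    requests_sent = 0
    for batch in batched(points, batch_size):
        # One upsert call per batch rather than one per point.
        client.upsert(collection_name=collection_name, points=batch)
        requests_sent += 1
    return requests_sent
```

For N points this makes roughly N / batch_size requests instead of N, which is where the overhead saving comes from.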
Verdict
Worth a look if you’re building with Qdrant and want frictionless local development. Skip it if you’re committed to another vector database or don’t need the offline-prototype-to-production path.