Is similarity-search-kit open source?

Yes — ZachNagengast/similarity-search-kit is open source, released under the Apache-2.0 license.

What language is similarity-search-kit written in?

ZachNagengast/similarity-search-kit is primarily written in Swift.

How popular is similarity-search-kit?

ZachNagengast/similarity-search-kit has 533 stars on GitHub.

Where can I find similarity-search-kit?

ZachNagengast/similarity-search-kit is on GitHub at https://github.com/ZachNagengast/similarity-search-kit.

ZachNagengast/similarity-search-kit

Semantic search that never phones home

A Swift package for running text embeddings and vector search entirely on Apple devices, because not every document belongs on someone else's server.

★ 533 stars | Swift | RAG · Search | Inference · Serving

What it does

SimilaritySearchKit lets you embed text and search by meaning on iOS and macOS without network calls. You initialize a SimilarityIndex with an embedding model and distance metric, feed it strings, then query for semantically similar results. It handles the model inference, vector storage, and similarity scoring locally using CoreML.
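The workflow described above (build an index, add strings, query by similarity) can be sketched in plain Swift. This is a minimal illustration, not the package's API: `ToyEmbedder`, `ToyIndex`, and `cosineSimilarity` are hypothetical names, and the "embedding" is a simple word count over a fixed vocabulary rather than a CoreML model, so it only captures word overlap, not meaning.

```swift
import Foundation

// Hypothetical stand-in for an embedding model: counts how often each
// vocabulary term appears in the text. A real model would capture meaning.
struct ToyEmbedder {
    let vocabulary: [String]

    func embed(_ text: String) -> [Double] {
        let words = text.lowercased().split { !$0.isLetter }.map { String($0) }
        return vocabulary.map { term in Double(words.filter { $0 == term }.count) }
    }
}

// Cosine similarity: dot product divided by the product of vector lengths.
func cosineSimilarity(_ a: [Double], _ b: [Double]) -> Double {
    let dot = zip(a, b).map { $0 * $1 }.reduce(0, +)
    let normA = sqrt(a.map { $0 * $0 }.reduce(0, +))
    let normB = sqrt(b.map { $0 * $0 }.reduce(0, +))
    return normA > 0 && normB > 0 ? dot / (normA * normB) : 0
}

// Hypothetical in-memory index that scores every stored item (brute force).
struct ToyIndex {
    let embedder: ToyEmbedder
    private var items: [(text: String, vector: [Double])] = []

    init(embedder: ToyEmbedder) { self.embedder = embedder }

    mutating func add(_ text: String) {
        items.append((text: text, vector: embedder.embed(text)))
    }

    func search(_ query: String, top k: Int) -> [(text: String, score: Double)] {
        let queryVector = embedder.embed(query)
        let scored = items.map { (text: $0.text, score: cosineSimilarity(queryVector, $0.vector)) }
        return Array(scored.sorted { $0.score > $1.score }.prefix(k))
    }
}

let vocabulary = ["password", "reset", "sales", "report", "account"]
var index = ToyIndex(embedder: ToyEmbedder(vocabulary: vocabulary))
index.add("How to reset a forgotten password")
index.add("Quarterly sales report for the hardware team")
index.add("Password rules for a new account")

let results = index.search("reset my password", top: 2)
for hit in results {
    print(hit.score, hit.text)
}
```

The password-reset document ranks first because it shares both vocabulary terms with the query; the account document ranks second with one shared term.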

The interesting bit

The library ships with pre-converted CoreML versions of Hugging Face models (DistilBERT, MiniLM variants) plus Apple’s built-in NaturalLanguage embedding. The whole pipeline (embeddings, metrics, text splitting, tokenization, even vector storage) is protocol-driven, so you can swap in custom implementations without touching the core search logic.
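To illustrate what a protocol-driven design buys you, here is a small sketch of swappable vector storage. The names `StoredItem`, `ToyVectorStore`, `JSONToyStore`, and `roundTrip` are assumptions made up for this example; the package's own protocols may look quite different. Only Foundation's `JSONEncoder`, `JSONDecoder`, and `FileManager` are real APIs here.

```swift
import Foundation

// One stored entry: the original text plus its embedding vector.
struct StoredItem: Codable, Equatable {
    let id: String
    let text: String
    let vector: [Double]
}

// Hypothetical storage protocol for illustration only.
protocol ToyVectorStore {
    func save(_ items: [StoredItem], to url: URL) throws
    func load(from url: URL) throws -> [StoredItem]
}

// JSON-backed implementation built on Foundation's Codable support.
struct JSONToyStore: ToyVectorStore {
    func save(_ items: [StoredItem], to url: URL) throws {
        let data = try JSONEncoder().encode(items)
        try data.write(to: url, options: .atomic)
    }

    func load(from url: URL) throws -> [StoredItem] {
        let data = try Data(contentsOf: url)
        return try JSONDecoder().decode([StoredItem].self, from: data)
    }
}

// Code written against the protocol works with any conforming store,
// so a binary or database-backed store could replace JSON without changes here.
func roundTrip<S: ToyVectorStore>(_ items: [StoredItem], through store: S, at url: URL) throws -> [StoredItem] {
    try store.save(items, to: url)
    return try store.load(from: url)
}

let storeURL = FileManager.default.temporaryDirectory.appendingPathComponent("toy-index.json")
let savedItems = [StoredItem(id: "1", text: "hello", vector: [0.25, 0.5])]
let restoredItems = try roundTrip(savedItems, through: JSONToyStore(), at: storeURL)
print(restoredItems == savedItems)
```

The search logic never needs to know which store it was handed, which is the property the paragraph above describes.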

Key highlights

Four built-in embedding models ranging from 46 MB to 86 MB, including quantized options for Q&A and general similarity
Three distance metrics: dot product, cosine similarity, Euclidean distance
Disk-backed indexing for datasets too large for memory, with JSON-based storage swappable via VectorStoreProtocol
Example projects covering basic search, PDF semantic search, and a full “chat with your files” macOS app
Bring-your-own-model support through EmbeddingsProtocol and DistanceMetricProtocol
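The three distance metrics listed above can rank the same candidates differently. The sketch below shows this with plain Swift; `ToyDistanceMetric` and the three metric structs are hypothetical names for illustration, not the package's `DistanceMetricProtocol` or its built-in types. Each metric returns a score where higher means more similar, so Euclidean distance is negated.

```swift
import Foundation

// Hypothetical protocol for illustration only.
protocol ToyDistanceMetric {
    // Higher scores always mean "more similar".
    func score(_ a: [Double], _ b: [Double]) -> Double
}

struct DotProductMetric: ToyDistanceMetric {
    func score(_ a: [Double], _ b: [Double]) -> Double {
        zip(a, b).map { $0 * $1 }.reduce(0, +)
    }
}

struct CosineMetric: ToyDistanceMetric {
    func score(_ a: [Double], _ b: [Double]) -> Double {
        let dot = DotProductMetric().score(a, b)
        let norms = sqrt(DotProductMetric().score(a, a)) * sqrt(DotProductMetric().score(b, b))
        return norms > 0 ? dot / norms : 0
    }
}

struct EuclideanMetric: ToyDistanceMetric {
    func score(_ a: [Double], _ b: [Double]) -> Double {
        // Negate the distance so that, like the other metrics, larger is better.
        let squared = zip(a, b).map { ($0 - $1) * ($0 - $1) }.reduce(0, +)
        return -sqrt(squared)
    }
}

// Returns the name of the highest-scoring candidate under the given metric.
func bestMatch<M: ToyDistanceMetric>(
    for query: [Double],
    in candidates: [(name: String, vector: [Double])],
    using metric: M
) -> String? {
    candidates.max { metric.score(query, $0.vector) < metric.score(query, $1.vector) }?.name
}

let queryVector: [Double] = [1, 0]
let candidates: [(name: String, vector: [Double])] = [
    (name: "short", vector: [1, 0]),      // same direction, same length
    (name: "long", vector: [3, 0]),       // same direction, longer
    (name: "orthogonal", vector: [0, 1])  // unrelated direction
]

let bestByDot = bestMatch(for: queryVector, in: candidates, using: DotProductMetric())
let bestByEuclidean = bestMatch(for: queryVector, in: candidates, using: EuclideanMetric())
print(bestByDot ?? "none", bestByEuclidean ?? "none")
```

Dot product favors the longer vector, Euclidean distance favors the identical one, and cosine similarity treats both as equally close because it ignores length. Which behavior is appropriate depends on whether the embedding model produces normalized vectors.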

Caveats

Requires iOS 16.0+ or macOS 13.0+ for the examples; exact base requirements for the package itself aren’t specified
Several features are explicitly listed as future work: no HNSW/Annoy approximate indexing yet, no query filters by metadata, no Metal acceleration for distance calculations
The README notes “all around performance improvements” are still pending

Verdict

Worth a look if you’re building privacy-sensitive or offline-first NLP features in Swift and want to avoid the complexity of self-hosting embedding services. Less compelling if you need production-grade approximate nearest-neighbor search at massive scale today: search here is still brute-force, with only basic disk-backed indexing for larger datasets.
