SQLite-backed embeddings that load in 0.7 seconds and sip 18KB RAM
A Python library that turns multi-gigabyte word vectors into lazy-loaded, memory-mapped SQLite databases with out-of-vocabulary smarts.

What it does
Magnitude is a Python package and file format (.magnitude) for storing and querying vector embeddings. It converts models from word2vec, GloVe, fastText, and ELMo into SQLite databases with indexes and memory mapping, then wraps them in a Pythonic API. The goal is to be a lighter, faster alternative to Gensim for production use.
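To make the storage idea concrete, here is a minimal sketch of vectors kept in a SQLite table with an indexed key column and fetched one row at a time. It is not Magnitude's actual schema or API: the table layout and the names `build_store` and `LazyVectors` are hypothetical, and the `mmap_size` pragma is only a hint that SQLite may ignore.

```python
import sqlite3
from array import array
from functools import lru_cache


def build_store(path, vectors):
    """Write {key: list of floats} into a SQLite table keyed (and indexed) by word."""
    conn = sqlite3.connect(path)
    # PRIMARY KEY gives SQLite an index on `key`, so lookups avoid a full scan.
    conn.execute("CREATE TABLE IF NOT EXISTS vectors (key TEXT PRIMARY KEY, vec BLOB)")
    conn.executemany(
        "INSERT OR REPLACE INTO vectors VALUES (?, ?)",
        [(k, array("f", v).tobytes()) for k, v in vectors.items()],
    )
    conn.commit()
    conn.close()


class LazyVectors:
    """Opens the database without reading any vectors; rows are fetched on demand."""

    def __init__(self, path, cache_size=1024):
        self._conn = sqlite3.connect(path)
        # Ask SQLite to memory-map up to 256 MiB of the file (a hint, not a guarantee).
        self._conn.execute("PRAGMA mmap_size = 268435456")
        # Per-instance LRU cache: repeated keys are served from memory.
        self.query = lru_cache(maxsize=cache_size)(self._fetch)

    def _fetch(self, key):
        row = self._conn.execute(
            "SELECT vec FROM vectors WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            return None  # key is out of vocabulary in this toy store
        vec = array("f")
        vec.frombytes(row[0])
        return tuple(vec)
```

Opening the store costs roughly the same regardless of file size, because nothing is read until `query` is called. That is the property behind the fast startup the library reports.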
The interesting bit
The real trick is treating a 4GB embedding file like a memory-mapped database rather than loading it into RAM. Magnitude uses SQLite with spatial indexing, SIMD instructions, and LRU caching to serve vectors from disk with near-RAM speed. It also handles out-of-vocabulary keys by falling back to character n-gram similarity, which means misspellings and rare words don’t just return zero vectors.
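The out-of-vocabulary fallback can be illustrated with a small sketch. This is a simplified stand-in rather than Magnitude's actual algorithm: it assumes Jaccard overlap of character n-grams as the similarity measure, and the names `char_ngrams` and `oov_vector` are hypothetical.

```python
def char_ngrams(word, n_min=3, n_max=5):
    """Return the set of character n-grams of a word padded with boundary markers."""
    padded = f"<{word}>"
    return {
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    }


def oov_vector(word, vocab_vectors, top_k=3):
    """Build a vector for an unseen word from the in-vocabulary words that
    share the most character n-grams with it (similarity-weighted average)."""
    target = char_ngrams(word)
    if not target:
        return None
    scored = []
    for key, vec in vocab_vectors.items():
        grams = char_ngrams(key)
        overlap = len(target & grams) / len(target | grams)  # Jaccard similarity
        if overlap > 0:
            scored.append((overlap, key, vec))
    if not scored:
        return None  # nothing in the vocabulary looks similar
    scored.sort(key=lambda item: item[0], reverse=True)
    best = scored[:top_k]
    total = sum(score for score, _, _ in best)
    dim = len(best[0][2])
    return [sum(score * vec[i] for score, _, vec in best) / total for i in range(dim)]
```

With a vocabulary containing "color" but not "colour", the misspelled or variant form lands on or near the vector for "color" instead of getting an uninformative default.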
Key highlights
- Lazy-loads models: initial load time is 0.72s for a 4.21GB file, with only 18KB RAM used at startup
- Warm single-key queries run in 0.04ms; even streaming over HTTP hits 0.4ms after first access
- Converts word2vec, GloVe, fastText, and ELMo models into the .magnitude format with a single utility
- Supports concatenating multiple embedding models and adding POS tag features
- Published at EMNLP 2018, so the approach has been peer-reviewed
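As a rough illustration of the concatenation and POS-feature highlight, the sketch below joins the vectors from several lookup tables and appends a one-hot part-of-speech feature. Everything here is an assumption made for illustration: the tag set, the zero-fill for missing keys, and the names `one_hot` and `combined_vector` are hypothetical, and the library's own feature encoding may differ.

```python
POS_TAGS = ["NOUN", "VERB", "ADJ"]  # hypothetical tag set for this sketch


def one_hot(tag, tags=POS_TAGS):
    """Encode a tag as a one-hot list; unknown tags become all zeros."""
    return [1.0 if tag == t else 0.0 for t in tags]


def combined_vector(key, tag, models):
    """Concatenate a key's vector from each (table, dim) pair, then append
    the one-hot POS feature. Missing keys are zero-filled in this sketch."""
    out = []
    for table, dim in models:
        out.extend(table.get(key, [0.0] * dim))
    out.extend(one_hot(tag))
    return out
```

The resulting vector has a length equal to the sum of the model dimensions plus the number of tags, so downstream code can size its input layer from the parts.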
Caveats
- First most_similar search without a disk cache can take 247 seconds (subsequent queries drop to ~0.24s)
- Google Colab installation requires a shell script workaround due to dependency conflicts
- The “Medium” and “Heavy” benchmark columns in the README are blank (marked with ━), so performance claims for those variants are unclear
Verdict
Worth a look if you’re running embedding queries in production and want to stop paying for RAM you don’t need. Less compelling if you’re doing heavy similarity search without the patience to warm the cache first, or if you’re already happy with Gensim’s in-memory performance.