nomic-ai/nomic

A Python SDK that turns your embeddings into explorable maps

This is the official client for Nomic Atlas, a managed platform that projects high-dimensional data into shareable, browser-based visualizations.

1.9k stars · Python · RAG · Search · Data Tooling
Velocity (7d): +1.3 ★ / day · Trend: steady

What it does

The nomic Python package is a thin client for Nomic Atlas, a hosted service that ingests embeddings (or raw text, images, audio, or video), projects them into 2-D, and renders them as interactive web maps. You upload vectors; Atlas handles storage, search, clustering, and deduplication. The browser becomes your data-exploration interface.
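To make "projects them into 2-D" concrete, here is a minimal local sketch. It uses plain PCA via power iteration as a stand-in; the README does not say which projection method Atlas uses, so treat the technique and every function name below as assumptions for illustration, not as the service's implementation.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def center(rows):
    """Subtract the per-dimension mean so the data is centered at the origin."""
    n, dim = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dim)]
    return [[r[j] - means[j] for j in range(dim)] for r in rows]


def covariance(rows):
    """Covariance matrix of already-centered rows."""
    n, dim = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / n for j in range(dim)]
            for i in range(dim)]


def top_component(matrix, iterations=200, seed=0):
    """Power iteration: unit eigenvector and eigenvalue of the largest eigenvalue."""
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in matrix]
    norm = math.sqrt(dot(v, v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [dot(row, v) for row in matrix]
        norm = math.sqrt(dot(w, w))
        if norm == 0.0:  # no variance left in this matrix
            break
        v = [x / norm for x in w]
    eigenvalue = dot(v, [dot(row, v) for row in matrix])
    return v, eigenvalue


def project_2d(rows):
    """Project vectors onto their first two principal components (PCA stand-in)."""
    centered = center(rows)
    cov = covariance(centered)
    v1, lam1 = top_component(cov)
    dim = len(cov)
    # Deflate: remove the first component so the next call finds the second.
    deflated = [[cov[i][j] - lam1 * v1[i] * v1[j] for j in range(dim)]
                for i in range(dim)]
    v2, _ = top_component(deflated)
    return [(dot(r, v1), dot(r, v2)) for r in centered]


# Hypothetical 4-D "embeddings"; real ones have hundreds of dimensions.
points = [[-2.0, 0.0, 0.0, 0.0], [2.0, 0.0, 0.0, 0.0],
          [0.0, 1.0, 0.0, 0.0], [0.0, -1.0, 0.0, 0.0]]
coords = project_2d(points)
```

Each input vector comes back as an (x, y) pair that could be plotted on a map. A hosted service would likely use a nonlinear method better suited to large, clustered data, but the input and output shapes are the same.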

The interesting bit

Atlas auto-generates a hierarchical topic model from the latent structure of your embeddings, not from manual tags, and exposes it as a pandas DataFrame. The README’s example shows news headlines clustered into topics like “Oil Prices → mergers and acquisitions” without any supervised labeling. It’s a neat trick: semantic organization derived purely from vector geometry.
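A rough sketch of how a per-document topic table with one column per depth level could be folded into a tree. The column names (`topic_depth_1`, `topic_depth_2`), the document ids, and the placeholder topic labels are assumptions for illustration; only the "Oil Prices → mergers and acquisitions" path comes from the README example. Plain dicts stand in for DataFrame rows so the snippet has no dependencies.

```python
from collections import defaultdict

# Hypothetical rows shaped like a topic table; column names are assumed.
rows = [
    {"id": "doc-1", "topic_depth_1": "Oil Prices",
     "topic_depth_2": "mergers and acquisitions"},
    {"id": "doc-2", "topic_depth_1": "Oil Prices",
     "topic_depth_2": "mergers and acquisitions"},
    {"id": "doc-3", "topic_depth_1": "Oil Prices",
     "topic_depth_2": "placeholder subtopic"},
    {"id": "doc-4", "topic_depth_1": "placeholder topic",
     "topic_depth_2": "placeholder subtopic"},
]


def build_topic_tree(rows):
    """Group document ids by top-level topic, then by second-level topic."""
    tree = defaultdict(lambda: defaultdict(list))
    for row in rows:
        tree[row["topic_depth_1"]][row["topic_depth_2"]].append(row["id"])
    # Convert nested defaultdicts to plain dicts for predictable behavior.
    return {top: dict(children) for top, children in tree.items()}


tree = build_topic_tree(rows)
for top, children in tree.items():
    for sub, ids in children.items():
        print(f"{top} → {sub}: {len(ids)} document(s)")
```

With a real DataFrame, the same grouping would be a `groupby` over the depth columns.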

Key highlights

  • Supports text, image, audio, and video modalities
  • Built-in semantic search with nearest-neighbor retrieval
  • Automatic topic clustering at multiple depth levels
  • Deduplication across all supported data types
  • Embeddable, shareable maps accessible without coding
  • Same team behind GPT4All
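
The semantic search and deduplication bullets both come down to comparing vectors. Here is a minimal brute-force sketch using cosine similarity; the function names, the 0.99 threshold, and the toy vectors are assumptions, and the README does not describe how Atlas implements either feature.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbors(query, vectors, k=3):
    """Indices of the k vectors most similar to the query (brute force)."""
    scored = [(cosine_similarity(query, v), i) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best first, ties by index
    return [i for _, i in scored[:k]]


def deduplicate(vectors, threshold=0.99):
    """Keep the first of any group of near-identical vectors; return kept indices."""
    kept = []
    for i, v in enumerate(vectors):
        if all(cosine_similarity(v, vectors[j]) < threshold for j in kept):
            kept.append(i)
    return kept


# Toy 2-D "embeddings": the second is a near-duplicate of the first.
vectors = [[1.0, 0.0], [0.999, 0.01], [0.0, 1.0], [-1.0, 0.0]]
print(nearest_neighbors([1.0, 0.0], vectors, k=2))
print(deduplicate(vectors))
```

Brute force is fine for a few thousand vectors. At the scale a hosted service targets, an approximate nearest-neighbor index would be the usual choice.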

Caveats

  • Requires a Nomic account and API token; not self-hostable
  • The README is vague on pricing, rate limits, and whether embeddings are generated locally or remotely
  • Heavy reliance on the hosted platform means this SDK is essentially glue code with limited standalone utility

Verdict

Worth a look if you need to explore large unstructured datasets or present them to non-technical stakeholders. Skip it if you want full control over your vector pipeline or need to keep everything on-premises.
