Is natural-language-image-search open source?

Yes — haltakov/natural-language-image-search is open source, released under the MIT license.

What language is natural-language-image-search written in?

GitHub classifies haltakov/natural-language-image-search as primarily Jupyter Notebook; the notebooks themselves contain Python code.

How popular is natural-language-image-search?

haltakov/natural-language-image-search has 1k stars on GitHub.

Where can I find natural-language-image-search?

haltakov/natural-language-image-search is on GitHub at https://github.com/haltakov/natural-language-image-search.

haltakov/natural-language-image-search

CLIP + 2 million Unsplash photos = search by vibes

A notebook pipeline that lets you find photos with phrases like "the feeling when your program finally works."

★ 1k stars | Jupyter Notebook | RAG · Search | Computer Vision

What it does

This repo is a Jupyter notebook pipeline that embeds nearly 2 million Unsplash photos using OpenAI’s CLIP model, then lets you search them with natural language queries. You type a sentence; CLIP encodes your text into the same vector space as the images, and the photos are ranked by cosine similarity to the query vector. There’s also a Colab notebook if you just want to play without downloading gigabytes of photos.
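
As a rough illustration of the ranking step, here is a minimal sketch in plain Python. It is not the repo's code: the toy 3-dimensional vectors stand in for real CLIP embeddings, and the names normalize, cosine_similarity, search, photo_vecs and query_vec are made up for this example.

```python
import math

def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))

def search(query_vec, photo_vecs, top_k=3):
    """Return (photo_id, score) pairs, best match first."""
    scored = [(pid, cosine_similarity(query_vec, vec))
              for pid, vec in photo_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Assumption: toy vectors standing in for precomputed CLIP image embeddings.
photo_vecs = {
    "photo_a": [0.9, 0.1, 0.0],
    "photo_b": [0.0, 1.0, 0.0],
    "photo_c": [0.7, 0.7, 0.1],
}
# Assumption: toy vector standing in for the CLIP embedding of the text query.
query_vec = [1.0, 0.2, 0.0]

results = search(query_vec, photo_vecs, top_k=2)
for photo_id, score in results:
    print(f"{photo_id}: {score:.3f}")
```

With roughly 2 million photos, a loop like this would be slow; a real pipeline would more likely keep the image vectors normalized in one matrix and score them all with a single matrix-vector product.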

The interesting bit

The project demonstrates CLIP’s ability to handle genuinely fuzzy, emotional queries: “the feeling when your program finally works” returns relevant results even though no photo is literally tagged with that phrase. That’s the latent-space magic: it isn’t matching keywords, it’s matching conceptual neighborhoods.

Key highlights

Pre-computed CLIP embeddings for ~2M Unsplash photos (full dataset, not just the public Lite version)
One-click Colab demo for query experimentation
Alternative notebook that fetches results from Unsplash’s own Search API and re-ranks them with CLIP
Notebooks are numbered sequentially: setup → download → process → search
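
The API re-ranking idea in the list above could look roughly like the sketch below. It assumes a keyword search has already returned a handful of candidate photo IDs and that each candidate can be turned into a unit-length vector; rerank, embed_image, FAKE_EMBEDDINGS and the api_hit_* IDs are hypothetical names for illustration, not part of the repo or of the Unsplash API.

```python
def dot(a, b):
    """Dot product; equals cosine similarity when both vectors are unit length."""
    return sum(x * y for x, y in zip(a, b))

def rerank(query_vec, candidates, embed_image, keep=2):
    """Re-order keyword-search candidates by similarity to the query vector.

    candidates:  photo IDs in the order the keyword search returned them
    embed_image: callable mapping a photo ID to a unit-length vector
                 (hypothetical stand-in for a CLIP image encoder)
    """
    scored = [(photo_id, dot(query_vec, embed_image(photo_id)))
              for photo_id in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [photo_id for photo_id, _ in scored[:keep]]

# Assumption: a fake lookup table instead of downloading and encoding images.
FAKE_EMBEDDINGS = {
    "api_hit_1": [0.0, 1.0],
    "api_hit_2": [1.0, 0.0],
    "api_hit_3": [0.6, 0.8],
}
query_vec = [0.8, 0.6]  # unit-length toy query embedding
api_order = ["api_hit_1", "api_hit_2", "api_hit_3"]

reranked = rerank(query_vec, api_order, FAKE_EMBEDDINGS.__getitem__, keep=2)
print(reranked)
```

Because only the candidates returned by the keyword search are ever scored, a photo the API never surfaces cannot be found, which fits the caveat that this route tends to give weaker results than searching the full local dataset.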

Caveats

The full Unsplash Dataset requires a (free) application; the Lite version is public but smaller
API-based search without the local dataset is supported but “will probably deliver worse results”
This is essentially a well-documented glue pipeline around CLIP and Unsplash data, not a novel model or production service

Verdict

Worth an hour if you’re building semantic search, need a CLIP-on-images reference implementation, or want to demo vector search to a skeptical team. Skip it if you need a hosted API or are already running your own image embedding pipeline.
