CLIP-powered grep for your meme hoard
A local tool that lets you type "a line drawing of a woman facing left" and actually find the image—no cloud, no metadata, no filenames required.

What it does
Memery indexes folders of images on your machine and lets you search them with plain English (or an example image, or both). It returns ranked results you can browse in a browser GUI, pipe through CLI tools, or call from Python code. Think of it as grep for visual memory, except you describe what you remember instead of remembering filenames.
The interesting bit
The project leans entirely on OpenAI’s CLIP model to embed images and text into the same latent space—no manual tagging, no OCR, no folder hierarchy required. The author built this to solve their own “10,000 memes and no filing system” problem, and the rough edges show: the GUI is Streamlit-based and “has some quirks,” folder paths must be typed manually, and GPU is the default install with CPU requiring a torch downgrade dance.
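To make the shared latent space concrete: once images and text are embedded as vectors, search reduces to ranking image vectors by their similarity to the query vector. The sketch below is an illustration only, not Memery's code. It uses hand-written three-dimensional toy vectors in place of real CLIP embeddings, and the file names and the `normalize`, `cosine`, and `rank` helpers are all hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))


def rank(query_vec, index, top_k=3):
    """Return the top_k (path, score) pairs, most similar first."""
    scored = [(path, cosine(query_vec, vec)) for path, vec in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy stand-ins for image embeddings (real CLIP vectors have hundreds of dimensions).
toy_index = {
    "sketch_woman.png": [0.9, 0.1, 0.0],
    "cat_meme.jpg": [0.1, 0.9, 0.2],
    "screenshot.png": [0.0, 0.2, 0.9],
}

# Toy stand-in for the embedding of a text query.
toy_query = [0.8, 0.2, 0.1]

results = rank(toy_query, toy_index, top_k=2)
for path, score in results:
    print(f"{score:.3f}  {path}")
```

In a real pipeline the vectors would come from a CLIP image encoder and text encoder. Everything after that step is ordinary nearest-neighbor ranking, which is why no tags or filenames are needed.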
Key highlights
- Natural language + image-to-image search over local files, no upload to cloud services
- Three interfaces: browser GUI (memery serve), CLI (memery recall/build/purge), and Python library (Memery.query_flow())
- Search time scales “well under O(n)” per the README, though the author notes it’s “not optimized for performance yet”
- Supports combined text-and-image queries (e.g., find things like this image but also matching this description)
- Works in Jupyter notebooks, including GUI functions
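One plausible way to picture a combined text-and-image query is to blend the two query embeddings into a single vector before ranking. This is an assumption made for illustration: the summary above does not say how Memery merges the two, and the `combine` and `score` helpers, the `text_weight` parameter, and the toy vectors are all hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def combine(text_vec, image_vec, text_weight=0.5):
    """Blend a text embedding and an image embedding into one unit-length query.

    Assumption: a weighted average of the normalized vectors. Other schemes
    (e.g. summing per-query scores) are equally plausible.
    """
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must be between 0 and 1")
    t, i = normalize(text_vec), normalize(image_vec)
    mixed = [text_weight * a + (1.0 - text_weight) * b for a, b in zip(t, i)]
    return normalize(mixed)


def score(query_vec, candidate_vec):
    """Cosine similarity between a unit-length query and a candidate vector."""
    return sum(a * b for a, b in zip(query_vec, normalize(candidate_vec)))


# Toy embeddings: axis 0 stands for "matches the description",
# axis 1 for "looks like the example image".
text_query = [1.0, 0.0, 0.0]
image_query = [0.0, 1.0, 0.0]
query = combine(text_query, image_query)

candidates = {
    "both": [1.0, 1.0, 0.0],
    "text_only": [1.0, 0.0, 0.0],
    "image_only": [0.0, 1.0, 0.0],
    "neither": [0.0, 0.0, 1.0],
}

scores = {name: score(query, vec) for name, vec in candidates.items()}
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

With equal weights, a candidate that matches both the description and the example image outranks one that matches only one of them. Changing `text_weight` shifts the balance toward one side.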
Caveats
- GPU default install; CPU-only requires manual PyTorch version pinning that may drift out of date
- GUI lacks a folder picker—users must type full directory paths into a text box
- The author flags the interface as “rough” with “some quirks” and notes that major errors appear as “giant stack traces” in the output panel
- Windows support amounts to a convenience bat file; broader packaging is explicitly deferred to “someday”
Verdict
Worth a look if you’ve got thousands of unsorted images locally and would rather describe what you see than maintain a taxonomy. Skip it if you need production polish or multi-user serving, or if you don’t have the GPU/VRAM to run CLIP embeddings comfortably.