
facebookresearch/ImageBind

Multi-modal embedding model from Meta AI that aligns images, text, audio, depth, thermal, and IMU data into a unified embedding space.

ImageBind
Velocity (7d): +7.7 ★ / day
Trend: steady

ImageBind is a PyTorch implementation of a multi-modal foundation model that learns a single joint embedding space for six modalities: images, text, audio, depth, thermal, and IMU data. Because every modality maps into the same space, the model supports emergent zero-shot classification across modalities, cross-modal retrieval, arithmetic composition of embeddings from different modalities, and cross-modal detection. The repository ships pretrained checkpoints, and the accompanying paper was a CVPR 2023 highlighted paper.
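To make the idea of a shared embedding space concrete, here is a minimal sketch of cross-modal retrieval and arithmetic composition using cosine similarity. It does not call ImageBind itself: the 3-dimensional vectors are hand-written toy data standing in for real embeddings, and the helper names (`normalize`, `cosine`, `retrieve`, `compose`) and file names are hypothetical, chosen only for illustration.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))


def retrieve(query, gallery):
    """Return the gallery key whose embedding is most similar to the query."""
    return max(gallery, key=lambda name: cosine(query, gallery[name]))


def compose(a, b):
    """Add two unit-length embeddings to form a combined query."""
    return [x + y for x, y in zip(normalize(a), normalize(b))]


# Toy data (assumption): hand-written vectors, NOT real model outputs.
# Axis 0 loosely means "dog", axis 2 loosely means "beach".
image_gallery = {
    "dog_on_beach.jpg": [0.7, 0.1, 0.7],
    "dog_indoors.jpg": [0.9, 0.1, 0.0],
    "empty_beach.jpg": [0.0, 0.1, 0.9],
}
audio_barking = [1.0, 0.0, 0.1]
audio_waves = [0.1, 0.0, 1.0]

# Cross-modal retrieval: an audio query ranks images directly.
best_for_bark = retrieve(audio_barking, image_gallery)

# Arithmetic composition: barking + waves should favor the image with both.
best_for_mix = retrieve(compose(audio_barking, audio_waves), image_gallery)

print(best_for_bark)  # dog_indoors.jpg
print(best_for_mix)   # dog_on_beach.jpg
```

With real embeddings the same ranking logic would apply, only with higher-dimensional vectors produced by the model's per-modality encoders.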
