
openai/CLIP

OpenAI's CLIP is a multimodal neural network trained on image-text pairs that performs zero-shot image classification given natural language queries.

33.7k stars · Jupyter Notebook · Language Models · Image · Video · Audio
Velocity (7d): +17 ★ / day · Trend: steady

CLIP (Contrastive Language-Image Pre-Training) is a neural network trained on diverse image-text pairs. Given an image, it can predict the most relevant text snippet from a set of candidates without task-specific fine-tuning. Because the model learns visual concepts from natural language descriptions, it transfers zero-shot to downstream tasks. Used zero-shot, it matches the accuracy of the original ResNet50 on ImageNet without using any of that dataset's labeled training examples.
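To make the zero-shot step concrete, here is a minimal sketch of how an image embedding can be scored against the embeddings of candidate text prompts. It is an illustration, not the repository's code: the three-dimensional vectors are made-up stand-ins for what the image and text encoders would produce, the helper names `normalize` and `zero_shot_probs` are hypothetical, and the `logit_scale` of 100 is an assumed temperature.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def zero_shot_probs(image_embedding, text_embeddings, logit_scale=100.0):
    """Return one probability per text prompt for a single image.

    logit_scale is an assumed temperature; it sharpens the softmax.
    """
    image = normalize(image_embedding)
    logits = []
    for text_embedding in text_embeddings:
        text = normalize(text_embedding)
        cosine = sum(a * b for a, b in zip(image, text))
        logits.append(logit_scale * cosine)

    # Numerically stable softmax over the candidate prompts.
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


# Toy embeddings (hypothetical): real ones would come from the model's encoders.
labels = ["a photo of a dog", "a photo of a cat", "a diagram"]
text_embeddings = [
    [0.9, 0.1, 0.0],
    [0.1, 0.9, 0.0],
    [0.0, 0.1, 0.9],
]
image_embedding = [0.8, 0.2, 0.1]

probs = zero_shot_probs(image_embedding, text_embeddings)
best_label = labels[probs.index(max(probs))]
print(best_label)
```

The class names act as the classifier: changing the task only means changing the list of prompts, with no retraining and no labeled examples.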
