lucidrains/x-clip

A PyTorch implementation of CLIP, a multi-modal model that learns to associate images with text using contrastive learning.

723 stars · Python · Language Models
Velocity (7d): +0.4 ★/day · Trend: steady

The repository provides a complete implementation of CLIP from OpenAI with additional experimental improvements from recent research papers. It includes support for fine-grained contrastive learning (FILIP), decoupled contrastive learning (DCL), extra latent projections (CLOOB), visual self-supervised learning, and masked language modeling (MLM) on text. The implementation allows configuring text and image encoders with customizable depth, heads, and patch sizes.
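
To make the contrastive objective concrete, here is a minimal, dependency-free sketch of the symmetric contrastive loss that CLIP-style models train with. It is not the repository's code: the function names (`l2_normalize`, `cross_entropy_rows`, `clip_contrastive_loss`) and the fixed `temperature=0.07` default are assumptions chosen for illustration, and a real implementation would use PyTorch tensors and typically a learned temperature.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cross_entropy_rows(logits, targets):
    """Mean cross-entropy over rows, using a numerically stable log-sum-exp."""
    total = 0.0
    for row, target in zip(logits, targets):
        row_max = max(row)
        log_z = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_z - row[target]
    return total / len(logits)


def clip_contrastive_loss(text_latents, image_latents, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired text/image latents.

    Pair i (text_latents[i], image_latents[i]) is the positive; every other
    combination in the batch is treated as a negative.
    The temperature value is an illustrative assumption.
    """
    text = [l2_normalize(v) for v in text_latents]
    image = [l2_normalize(v) for v in image_latents]

    # sim[i][j] = cosine similarity of text i and image j, scaled by temperature.
    sim = [
        [sum(a * b for a, b in zip(t, im)) / temperature for im in image]
        for t in text
    ]
    sim_transposed = [list(col) for col in zip(*sim)]

    targets = list(range(len(sim)))  # the matching pair sits on the diagonal
    text_to_image = cross_entropy_rows(sim, targets)
    image_to_text = cross_entropy_rows(sim_transposed, targets)
    return 0.5 * (text_to_image + image_to_text)


if __name__ == "__main__":
    texts = [[1.0, 0.0], [0.0, 1.0]]
    images_aligned = [[1.0, 0.0], [0.0, 1.0]]
    images_swapped = [[0.0, 1.0], [1.0, 0.0]]
    print("aligned loss:", clip_contrastive_loss(texts, images_aligned))
    print("swapped loss:", clip_contrastive_loss(texts, images_swapped))
```

Correctly paired latents should give a loss near zero, while swapping the images should give a much larger one. The variants listed above (FILIP, DCL, CLOOB) each change how the similarity matrix or the loss is computed, so this sketch corresponds only to the baseline CLIP objective.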