NVlabs/ODISE
Open-vocabulary panoptic segmentation system using frozen text-to-image diffusion and discriminative model representations.

Velocity (7d): +0.8 ★/day · Trend: steady
ODISE performs open-vocabulary panoptic segmentation by exploiting two kinds of pre-trained models: a text-to-image diffusion model and a discriminative model. It extracts frozen representations from both and combines them to segment and label arbitrary categories zero-shot. This lets it work in open-world settings without training on the specific classes it is asked to segment.
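To make the idea of combining two frozen representations concrete, here is a minimal sketch of open-vocabulary labeling for a single mask. It is not ODISE's code: the function names, the toy 2-D embeddings, the temperature, and the geometric-mean fusion rule are all assumptions chosen for illustration. Each mask embedding is compared with category text embeddings by cosine similarity, and the two resulting class distributions are then fused.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def softmax(scores, temperature=0.1):
    """Numerically stable softmax with a temperature (hypothetical value)."""
    m = max(scores)
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_mask(mask_embedding, text_embeddings, temperature=0.1):
    """Score one mask embedding against every category text embedding."""
    names = list(text_embeddings)
    sims = [cosine(mask_embedding, text_embeddings[n]) for n in names]
    return dict(zip(names, softmax(sims, temperature)))


def fuse(p_a, p_b, weight=0.5):
    """Weighted geometric mean of two class distributions, renormalized.

    This fusion rule is an assumption for the sketch, not a confirmed
    description of how ODISE combines its two branches.
    """
    fused = {k: (p_a[k] ** weight) * (p_b[k] ** (1 - weight)) for k in p_a}
    total = sum(fused.values())
    return {k: v / total for k, v in fused.items()}


# Toy data: in a real system these would come from frozen encoders.
text_embeddings = {"cat": [1.0, 0.0], "dog": [0.0, 1.0]}
diffusion_mask_embedding = [0.9, 0.1]        # stand-in for diffusion features
discriminative_mask_embedding = [0.7, 0.3]   # stand-in for discriminative features

p_diffusion = classify_mask(diffusion_mask_embedding, text_embeddings)
p_discriminative = classify_mask(discriminative_mask_embedding, text_embeddings)
fused = fuse(p_diffusion, p_discriminative)
label = max(fused, key=fused.get)
```

Because the category list is just a dictionary of text embeddings, new categories can be added at inference time without retraining anything, which is the property the description calls zero-shot.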