Is ODISE open source?

Yes — NVlabs/ODISE is an open-source project tracked on heatdrop.

What language is ODISE written in?

NVlabs/ODISE is primarily written in Python.

How popular is ODISE?

NVlabs/ODISE has 945 stars on GitHub.

Where can I find ODISE?

NVlabs/ODISE is on GitHub at https://github.com/NVlabs/ODISE.


NVlabs/ODISE

Diffusion models moonlight as open-vocabulary segmentation engines

ODISE repurposes frozen text-to-image diffusion models and CLIP to perform panoptic segmentation on arbitrary categories without task-specific training data.

★ 945 stars | Python | Computer Vision | Image · Video · Audio




What it does

ODISE performs panoptic segmentation of images based on free-form text descriptions, segmenting categories it never saw during training. It keeps Stable Diffusion and CLIP frozen, training a lightweight 28.1-million-parameter head to decode their pre-trained representations into pixel-level masks and instance boundaries.
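
This summary does not spell out how free-form text becomes a label for each mask. One common open-vocabulary recipe, assumed here for illustration and not taken from ODISE's code, is to compare an embedding for each predicted mask against text embeddings of the candidate category names and pick the closest one. The function names and the toy 3-dimensional vectors below are hypothetical stand-ins for real CLIP features.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def classify_masks(mask_embeddings, text_embeddings):
    """Give each mask the category whose text embedding is most similar."""
    labels = []
    for emb in mask_embeddings:
        best = max(text_embeddings, key=lambda name: cosine(emb, text_embeddings[name]))
        labels.append(best)
    return labels


# Hypothetical text embeddings, one per category name supplied at inference time.
text_embeddings = {
    "cat": [1.0, 0.1, 0.0],
    "sofa": [0.0, 1.0, 0.2],
    "lamp": [0.1, 0.0, 1.0],
}

# Hypothetical per-mask embeddings, as a trained head might produce.
mask_embeddings = [
    [0.9, 0.2, 0.1],
    [0.1, 0.8, 0.3],
]

labels = classify_masks(mask_embeddings, text_embeddings)
print(labels)  # ['cat', 'sofa']
```

Because the category list is only a dictionary of text embeddings, adding a new category in this sketch means adding one more entry, with no retraining of the mask predictor.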

The interesting bit

Instead of building a massive vision backbone from scratch, ODISE treats the diffusion model’s internal features as a spatial semantic atlas. The bet is that a generator trained to synthesize images from text already knows where objects belong; ODISE just asks it to point instead of paint.

Key highlights

Open-vocabulary panoptic segmentation across arbitrary text-described categories
Relies entirely on frozen Stable Diffusion and CLIP representations
Only 28.1M trainable parameters atop the frozen backbones
Released as a CVPR 2023 Highlight with pre-trained checkpoints and a Hugging Face demo
Weights are under a non-commercial CC BY-NC-SA 4.0 license

Caveats

Pre-trained weights are licensed for non-commercial use only (CC BY-NC-SA 4.0), so commercial deployment is not permitted
First run automatically downloads Stable Diffusion and CLIP checkpoints, which carry their own separate license terms
The caption-supervised checkpoint trails the label-supervised variant by nearly 10 PQ points on COCO
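
For readers unfamiliar with the PQ (panoptic quality) metric behind that gap, here is a minimal sketch of the standard definition: the sum of IoUs over matched segments, divided by the number of matches plus half the false positives and half the false negatives. The function name and the sample numbers are illustrative only and are not ODISE results.

```python
def panoptic_quality(matched_ious, num_false_positives, num_false_negatives):
    """Compute PQ from the IoUs of matched segment pairs and the unmatched counts.

    A predicted segment and a ground-truth segment count as a match
    (true positive) only when their IoU is greater than 0.5.
    """
    if any(iou <= 0.5 or iou > 1.0 for iou in matched_ious):
        raise ValueError("matched IoUs must lie in (0.5, 1.0]")
    true_positives = len(matched_ious)
    denominator = true_positives + 0.5 * num_false_positives + 0.5 * num_false_negatives
    if denominator == 0:
        return 0.0  # no segments predicted and none in the ground truth
    return sum(matched_ious) / denominator


# Illustrative numbers: three matches, one spurious prediction, one missed segment.
pq = panoptic_quality([0.9, 0.8, 0.7], num_false_positives=1, num_false_negatives=1)
print(round(100 * pq, 1))  # 60.0
```

PQ is usually reported on a 0 to 100 scale, so a gap of about 10 PQ points corresponds to a difference of about 0.10 in the value this function returns.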

Verdict

A solid reference if you research open-world perception or wonder what diffusion models implicitly know about scene layout. Give it a pass if you need a commercially deployable, battle-hardened segmentation product.
