wanghao9610/OV-DINO
A unified open-vocabulary detection model that detects and segments objects based on free-form text descriptions using language-aware selective fusion.

Velocity (7d): +0.6 ★/day · Trend: → steady
OV-DINO is a foundation model for open-vocabulary detection and segmentation. It uses language-aware selective fusion to support zero-shot object detection: given a textual description, it can detect and segment object categories that it never saw during training. The model reports state-of-the-art zero-shot object detection results on the MS COCO and LVIS benchmarks.
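
To make the zero-shot idea concrete, here is a minimal sketch of how open-vocabulary labeling can work in general: each candidate region is assigned the category whose text embedding it most resembles. This is not OV-DINO's code or API, and it does not implement language-aware selective fusion. The function names, the toy 3-dimensional embeddings, and the threshold value are all hypothetical, chosen only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify_regions(region_embeddings, text_embeddings, threshold=0.5):
    """Label each region with its best-matching text category.

    Returns None for a region whose best score falls below the threshold.
    (Hypothetical helper, not part of OV-DINO.)
    """
    labels = []
    for region in region_embeddings:
        scores = {name: cosine(region, emb) for name, emb in text_embeddings.items()}
        best = max(scores, key=scores.get)
        labels.append(best if scores[best] >= threshold else None)
    return labels


# Toy embeddings (assumed values). In a real system these would come from
# a text encoder and an image backbone that share an embedding space.
text_embeddings = {
    "cat": [1.0, 0.0, 0.0],
    "dog": [0.0, 1.0, 0.0],
    "zebra": [0.0, 0.0, 1.0],  # a category added at query time, by name only
}
region_embeddings = [
    [0.9, 0.1, 0.0],   # close to "cat"
    [0.1, 0.2, 0.95],  # close to "zebra"
    [0.5, 0.5, 0.5],   # ambiguous, rejected by the threshold
]

labels = classify_regions(region_embeddings, text_embeddings, threshold=0.8)
print(labels)
```

Because categories enter only as text embeddings, a new category can be added at query time by supplying its name, with no retraining of the region features.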