Is GroundingDINO open source?

Yes — IDEA-Research/GroundingDINO is open source, released under the Apache-2.0 license.

What language is GroundingDINO written in?

IDEA-Research/GroundingDINO is primarily written in Python.

How popular is GroundingDINO?

IDEA-Research/GroundingDINO has 10.4k stars on GitHub.

Where can I find GroundingDINO?

IDEA-Research/GroundingDINO is on GitHub at https://github.com/IDEA-Research/GroundingDINO.

← all repositories

IDEA-Research/GroundingDINO

Object detection that reads English

A vision-language detector that finds objects in images using plain text prompts, no retraining required.

★10.4k stars Python Computer Vision

View on GitHub ↗ Homepage ↗

Not currently ranked — collecting fresh signals.

star history

What it does

Grounding DINO takes an image and a text prompt—say, “two dogs with a stick”—and returns bounding boxes for the objects you described. It was trained to align visual features with language, so it generalizes to categories it never saw during training. The repo provides PyTorch inference code and pretrained checkpoints.

The interesting bit

The model outputs 900 candidate boxes per image, then scores each box against every word in your prompt. You threshold both the box confidence and the text-word similarity to decide what counts as a match. This means you can fish for specific phrases (“dogs” but not “stick”) without retraining the network.

Key highlights

Zero-shot COCO performance of 52.5 AP without training on COCO; 63.0 AP with fine-tuning
CPU-only inference supported since March 2023
Hugging Face integration available; also Colab demos and a web demo
Ecosystem includes Grounded SAM (segmentation), Stable Diffusion image editing, and GLIGEN integration
Training code not yet released (still on the TODO list)

Caveats

Installation is finicky: missing CUDA_HOME causes a cryptic NameError: name '_C' is not defined that requires a full reinstall
The README warns that tokenizer behavior varies—word count ≠ token count, and category names should be separated by periods for best results

Verdict

Worth a look if you need flexible, promptable detection without curating a fixed class list. Skip it if you need training code today or want a polished pip-install experience.

Frequently asked

What is IDEA-Research/GroundingDINO?: A vision-language detector that finds objects in images using plain text prompts, no retraining required.
Is GroundingDINO open source?: Yes — IDEA-Research/GroundingDINO is open source, released under the Apache-2.0 license.
What language is GroundingDINO written in?: IDEA-Research/GroundingDINO is primarily written in Python.
How popular is GroundingDINO?: IDEA-Research/GroundingDINO has 10.4k stars on GitHub.
Where can I find GroundingDINO?: IDEA-Research/GroundingDINO is on GitHub at https://github.com/IDEA-Research/GroundingDINO.