om-ai-lab/OmDet
OmDet-Turbo is a real-time open-vocabulary end-to-end object detection model supporting arbitrary text queries for zero-shot detection.

OmDet-Turbo is a computer vision model for open-vocabulary object detection: it detects object classes specified in natural language at inference time, without having been trained on those specific classes. It uses a vision-language architecture to align visual features with text embeddings, allowing zero-shot detection across arbitrary categories. The model has been integrated into Hugging Face Transformers (v4.45.0) and supports ONNX export for deployment.
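
To make the idea of aligning visual features with text embeddings concrete, here is a minimal, library-free sketch of scoring image regions against text queries by cosine similarity. It is an illustration of the general open-vocabulary principle only, not OmDet-Turbo's actual architecture or API: the function names (`cosine`, `classify_regions`), the 3-dimensional toy vectors, and the `threshold` value are all assumptions made for this example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def classify_regions(region_feats, text_embeds, threshold=0.5):
    """Assign each region the best-matching text label, or None if no label clears the threshold.

    region_feats: list of feature vectors, one per candidate region (hypothetical values).
    text_embeds:  dict mapping a free-form class name to its embedding (hypothetical values).
    """
    results = []
    for feat in region_feats:
        scores = {label: cosine(feat, emb) for label, emb in text_embeds.items()}
        best = max(scores, key=scores.get)
        results.append((best, scores[best]) if scores[best] >= threshold else None)
    return results

# Toy embeddings: in a real model these would come from the text and image encoders.
text_embeds = {"cat": [1.0, 0.0, 0.0], "remote control": [0.0, 1.0, 0.0]}
region_feats = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.0, 1.0]]

detections = classify_regions(region_feats, text_embeds)
for i, det in enumerate(detections):
    print(i, det)
```

Because the class list is just a dictionary of text embeddings, adding a new category means adding one more entry at inference time, with no retraining; the third region matches neither query and is dropped by the threshold.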