
Atten4Vis/ConditionalDETR

A transformer-based object detection model whose training converges 6.7x to 10x faster than the original DETR on COCO.

404 stars · Python · Computer Vision
Velocity (7d): +0.2 ★ / day · Trend: steady

This repository implements Conditional DETR, an object detection model that modifies the cross-attention mechanism in the transformer decoder. The key innovation is a conditional spatial query that narrows each cross-attention head to a specific image region, which reduces the dependence on high-quality content embeddings and makes training easier. The model is integrated with Hugging Face Transformers and, evaluated on COCO 2017 validation, converges 6.7x to 10x faster than the original DETR.
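To make the idea of a conditional spatial query more concrete, here is a minimal pure-Python sketch. It is not the repository's code. It assumes a simplified formulation in which the spatial query is an element-wise product of a scale vector (in the real model this would be predicted from the decoder embedding) and a sinusoidal embedding of a reference point, and in which concatenating content and spatial parts makes the attention logit a sum of two dot products. All function names and the 1-D position are hypothetical and chosen for illustration only.

```python
import math


def sinusoidal_embedding(pos, dim, temperature=10000.0):
    """Embed a scalar position in [0, 1] as interleaved sin/cos values (dim must be even)."""
    out = []
    for i in range(dim // 2):
        freq = temperature ** (2 * i / dim)
        angle = 2 * math.pi * pos / freq
        out.extend([math.sin(angle), math.cos(angle)])
    return out


def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def conditional_spatial_query(scale, reference_embedding):
    """Modulate the reference-point embedding element-wise.

    Assumption: `scale` stands in for a vector that the model would
    derive from the decoder embedding; here it is supplied by hand.
    """
    return [s * p for s, p in zip(scale, reference_embedding)]


def cross_attention_logit(content_q, spatial_q, content_k, spatial_k):
    """Logit for concatenated [content; spatial] query and key.

    The dot product of the concatenated vectors splits into a content
    term and a spatial term, so the two can be inspected separately.
    """
    return dot(content_q, content_k) + dot(spatial_q, spatial_k)


if __name__ == "__main__":
    dim = 8
    ref = sinusoidal_embedding(0.3, dim)    # reference point of one query
    near = sinusoidal_embedding(0.3, dim)   # key at the same position
    far = sinusoidal_embedding(0.8, dim)    # key far from the reference
    spatial_q = conditional_spatial_query([1.0] * dim, ref)
    zeros = [0.0] * dim                     # ignore the content term here
    print(cross_attention_logit(zeros, spatial_q, zeros, near))
    print(cross_attention_logit(zeros, spatial_q, zeros, far))
```

In this toy setup the spatial term alone is larger for the key located at the reference point than for the distant key, which is one way to picture how a spatial query can localize attention without relying on the content embeddings.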
