fundamentalvision/Deformable-DETR
Deformable DETR is a transformer-based object detection model that uses deformable attention to attend only to sparse sampling points, achieving better performance than standard DETR with 10× fewer training epochs.

This repository provides the official PyTorch implementation of Deformable DETR, a computer vision model that improves end-to-end object detection by replacing dense transformer attention with a sampling-based deformable attention mechanism. Each query attends only to a small set of key sampling points around a reference location, which addresses the slow convergence and limited spatial resolution of the original DETR. The improvement is most pronounced on small objects. The repository includes training and evaluation code for the COCO benchmark.
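To make the sampling idea concrete, here is a minimal pure-Python sketch of a single query attending to a handful of points around its reference location. It is an illustration under simplifying assumptions, not the repository's code: it uses one attention head, one feature scale, and scalar features, and it takes the sampling offsets and attention logits as plain arguments, whereas in the model they would be predicted from the query. The function names `bilinear_sample`, `softmax`, and `deformable_attention` are hypothetical.

```python
import math


def bilinear_sample(feature_map, x, y):
    """Interpolate an H x W grid of floats at a fractional (x, y) position.

    Positions outside the grid contribute zero (zero padding).
    """
    height, width = len(feature_map), len(feature_map[0])
    x0, y0 = math.floor(x), math.floor(y)
    value = 0.0
    # Visit the four integer grid cells surrounding (x, y).
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= xi < width and 0 <= yi < height:
                weight = (1 - abs(x - xi)) * (1 - abs(y - yi))
                value += weight * feature_map[yi][xi]
    return value


def softmax(logits):
    """Normalize logits into weights that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def deformable_attention(feature_map, reference_point, offsets, logits):
    """Weighted sum of features sampled at reference_point + each offset.

    Assumption: offsets and logits are given directly; a real model would
    predict both from the query embedding.
    """
    ref_x, ref_y = reference_point
    weights = softmax(logits)
    output = 0.0
    for (dx, dy), weight in zip(offsets, weights):
        output += weight * bilinear_sample(feature_map, ref_x + dx, ref_y + dy)
    return output


# A 3 x 3 feature map; the query looks at only 3 points instead of all 9 cells.
feature_map = [
    [0.0, 1.0, 2.0],
    [3.0, 4.0, 5.0],
    [6.0, 7.0, 8.0],
]
result = deformable_attention(
    feature_map,
    reference_point=(1.0, 1.0),
    offsets=[(0.0, 0.0), (0.5, 0.0), (-1.0, -1.0)],
    logits=[0.0, 0.0, 0.0],
)
print(round(result, 4))  # 2.8333
```

The cost per query depends on the number of sampling points rather than on the size of the feature map, which is what makes attention over high-resolution features affordable in this scheme.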