YOLO without the NMS hangover
A real-time object detector that finally ditches non-maximum suppression to cut latency and simplify deployment.
What it does YOLOv10 is a real-time object detector that removes the usual post-processing bottleneck. It trains with consistent dual label assignments: a one-to-many branch supplies rich supervision during training, while a one-to-one branch learns to emit a single box per object. Because the model suppresses duplicate boxes itself, NMS is not needed at inference time. The authors also re-examined the main YOLO components for wasted computation, trimming redundancy while keeping accuracy.
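The dual-assignment idea can be sketched in a few lines. This is a toy illustration, not the repo's code: it scores every (ground truth, prediction) pair with a metric of the form p^alpha * IoU^beta, then picks the top-k predictions per object for the one-to-many branch and only the top-1 for the one-to-one branch. The function names, the example scores, and the alpha, beta, and k values are assumptions for illustration, and the spatial prior that restricts candidates to anchors inside the object is omitted.

```python
import numpy as np

def matching_metric(cls_score, iou, alpha=0.5, beta=6.0):
    """Score each (ground truth, prediction) pair as p^alpha * IoU^beta.

    alpha and beta are illustrative values, not taken from the repo.
    """
    return (cls_score ** alpha) * (iou ** beta)

def one_to_many(metric, k=2):
    """Boolean (num_gt, num_pred) mask: top-k predictions per ground truth."""
    idx = np.argsort(-metric, axis=1)[:, :k]
    mask = np.zeros_like(metric, dtype=bool)
    np.put_along_axis(mask, idx, True, axis=1)
    return mask

def one_to_one(metric):
    """Boolean (num_gt, num_pred) mask: single best prediction per ground truth."""
    idx = np.argmax(metric, axis=1)[:, None]
    mask = np.zeros_like(metric, dtype=bool)
    np.put_along_axis(mask, idx, True, axis=1)
    return mask

# Made-up scores for 2 ground-truth objects and 5 candidate predictions.
cls_score = np.array([[0.9, 0.8, 0.3, 0.1, 0.2],
                      [0.1, 0.2, 0.7, 0.6, 0.9]])
iou = np.array([[0.8, 0.9, 0.5, 0.1, 0.1],
                [0.1, 0.1, 0.6, 0.9, 0.7]])

metric = matching_metric(cls_score, iou)
o2m = one_to_many(metric, k=2)   # several positives per object (training only)
o2o = one_to_one(metric)         # one positive per object (kept at inference)
```

Because both branches rank candidates with the same metric in this sketch, the one-to-one positive is always the best of the one-to-many positives, so the two branches pull the model in a compatible direction.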
The interesting bit The NMS-free approach is the headline, but the quieter work is the “holistic efficiency-accuracy driven” model design strategy. This is a manual redesign rather than an automated architecture search: the authors optimized components jointly for both cost and accuracy instead of swapping one block at a time. That is how YOLOv10-S ends up 1.8× faster than RT-DETR-R18 with 2.8× fewer parameters at similar COCO AP.
Key highlights
- Six model scales (N, S, M, B, L, X) from 2.3M to 29.5M parameters, with reported latency from 1.84 ms to 10.70 ms on COCO val
- YOLOv10-B: 46% less latency and 25% fewer parameters than YOLOv9-C for the same AP
- Fair speed benchmarks require an exported format (ONNX, OpenVINO, etc.); the raw PyTorch model runs extra ops at inference that skew timing
- Active ecosystem: Transformers.js web demo, Jetson Docker image, DeepSORT/ByteTrack integrations, RK3588 and OpenVINO C++ ports
- NeurIPS 2024; official PyTorch implementation with Gradio demo included
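For readers who want to check the latency numbers on their own hardware, a minimal timing harness might look like the following. It is a generic sketch, not the authors' benchmark script: `measure_latency_ms` and `dummy_infer` are hypothetical names, and `dummy_infer` stands in for whatever call runs your exported model. The warmup and run counts are arbitrary.

```python
import statistics
import time

def measure_latency_ms(infer, sample, warmup=10, runs=50):
    """Return the median wall-clock latency of infer(sample) in milliseconds."""
    # Warm-up iterations let caches and lazy initialization settle first.
    for _ in range(warmup):
        infer(sample)
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        infer(sample)
        timings.append((time.perf_counter() - start) * 1000.0)
    # The median is less sensitive to stray slow runs than the mean.
    return statistics.median(timings)

def dummy_infer(x):
    """Placeholder workload; swap in the exported model's inference call."""
    return sum(v * v for v in x)

latency = measure_latency_ms(dummy_infer, list(range(1000)))
print(f"median latency: {latency:.3f} ms")
```

On a GPU, the inference call must block until the device has finished (or you must synchronize explicitly) before the timer stops, otherwise the measurement only covers the kernel launch.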
Caveats
- The README now leads with YOLOE, their newer open-vocabulary follow-up; YOLOv10 feels slightly archived
- Small or distant objects may need tuning; the authors link to community clarifications but don’t detail fixes in-repo
Verdict Worth a look if you’re deploying YOLO on edge hardware and NMS has been your profiling surprise. Skip if you need open-vocabulary detection—check their YOLOE repo instead.