megvii-research/MOTRv2
End-to-end multi-object tracking system that combines a transformer architecture with a pretrained object detector for video-based tracking.

Velocity (7d): +0.4 ★/day · Trend: steady
MOTRv2 is a computer vision system for multi-object tracking in video. It bootstraps an end-to-end tracker with a pretrained object detector (YOLOX): the detector's proposals serve as anchors for the tracker, which improves on prior end-to-end approaches. The tracker uses a transformer-based architecture, and the authors report state-of-the-art results on large-scale benchmarks such as DanceTrack (73.4% HOTA) and BDD100K.
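
To make the "detection proposals as anchors" idea concrete, here is a minimal sketch of how detector output might be turned into anchors for one frame. It is not the repository's code: the `Anchor` class, the function names, the `min_score` threshold, and the normalized (cx, cy, w, h, score) layout are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical detector output: (x1, y1, x2, y2, score) in pixels.
Detection = Tuple[float, float, float, float, float]


@dataclass
class Anchor:
    """Illustrative anchor: normalized box plus confidence (assumed layout)."""
    cx: float
    cy: float
    w: float
    h: float
    score: float
    track_id: int  # -1 marks a fresh detector proposal with no identity yet


def proposals_to_anchors(
    dets: List[Detection], img_w: float, img_h: float, min_score: float = 0.05
) -> List[Anchor]:
    """Convert pixel-space detections into normalized center-size anchors."""
    anchors = []
    for x1, y1, x2, y2, score in dets:
        if score < min_score:  # assumed filter; drops very weak proposals
            continue
        anchors.append(
            Anchor(
                cx=(x1 + x2) / 2 / img_w,
                cy=(y1 + y2) / 2 / img_h,
                w=(x2 - x1) / img_w,
                h=(y2 - y1) / img_h,
                score=score,
                track_id=-1,
            )
        )
    return anchors


def build_frame_queries(
    track_anchors: List[Anchor], dets: List[Detection], img_w: float, img_h: float
) -> List[Anchor]:
    """Anchors carried over from earlier frames come first, then new proposals."""
    return list(track_anchors) + proposals_to_anchors(dets, img_w, img_h)
```

In this sketch, a transformer decoder would consume the combined list each frame: anchors carried over from earlier frames keep their identities, while the new proposals give the model candidate locations for objects entering the scene.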