fundamentalvision/BEVFormer
BEVFormer is a camera-only autonomous driving perception framework using spatiotemporal Transformers for 3D object detection and semantic map segmentation.

Velocity (7d): +2.9 ★/day · Trend: steady
BEVFormer is a Transformer-based, camera-only framework that learns bird’s-eye-view (BEV) representations from multi-camera images for autonomous driving. It uses spatiotemporal attention, with spatial cross-attention over the camera features and temporal self-attention over the previous frame's BEV features, to support 3D object detection and semantic map segmentation. The model achieved state-of-the-art results on nuScenes at the time of its release (51.7% NDS, the nuScenes Detection Score) and won the Waymo Open Dataset 3D Camera-Only Detection Challenge in 2022.
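For readers unfamiliar with the NDS figure quoted above, the following is a minimal sketch of how the nuScenes Detection Score combines mean average precision (mAP) with the five mean true-positive error metrics (mATE, mASE, mAOE, mAVE, mAAE). The function name `nuscenes_detection_score` and the input values are illustrative assumptions; they are not taken from the BEVFormer repository or from its reported results.

```python
# Sketch of the nuScenes Detection Score (NDS):
#   NDS = (5 * mAP + sum(1 - min(1, err) for each TP error)) / 10
# The function name and the sample numbers below are hypothetical.

TP_METRICS = ("mATE", "mASE", "mAOE", "mAVE", "mAAE")


def nuscenes_detection_score(mean_ap, tp_errors):
    """Combine mAP and the five mean true-positive errors into one score."""
    missing = [name for name in TP_METRICS if name not in tp_errors]
    if missing:
        raise ValueError(f"missing TP metrics: {missing}")
    # Each error is clipped to 1 and turned into a score in [0, 1].
    tp_scores = [1.0 - min(1.0, tp_errors[name]) for name in TP_METRICS]
    # mAP carries a weight of 5; the five TP scores carry a weight of 1 each.
    return (5.0 * mean_ap + sum(tp_scores)) / 10.0


if __name__ == "__main__":
    # Made-up inputs, for illustration only.
    example_errors = {name: 0.5 for name in TP_METRICS}
    print(round(nuscenes_detection_score(0.4, example_errors), 3))
```

Because mAP takes half of the total weight, NDS rewards detection accuracy first, while the remaining half penalizes errors in translation, scale, orientation, velocity, and attribute estimates.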