google-research-datasets/Objectron
Google Research dataset of 15K annotated video clips and 4M images with 3D bounding boxes for 9 object categories.

Objectron is a large-scale dataset of short, object-centric video clips with manually annotated 3D bounding boxes that describe each object's position, orientation, and dimensions. The dataset also includes AR session metadata such as camera poses, sparse point clouds, and characterization of planar surfaces. It is designed to support training and evaluation of 3D object detection and other computer vision models; pre-trained models for a subset of the object categories are released in MediaPipe.
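
To illustrate what a 3D bounding box defined by position, orientation, and dimensions encodes, here is a minimal sketch that expands such a box into its eight corner points. It is not the dataset's own loader or annotation format: the function names `yaw_rotation` and `box_corners`, the corner ordering, and the choice of the y-axis as the rotation axis are assumptions made for illustration only.

```python
import math
from itertools import product


def yaw_rotation(theta):
    """Return a 3x3 matrix rotating by theta radians about the y-axis.

    Hypothetical helper: any valid 3x3 rotation matrix could be used instead.
    """
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]


def box_corners(center, rotation, dimensions):
    """Return the 8 corners of an oriented 3D box as (x, y, z) tuples.

    center:     box position (x, y, z)
    rotation:   3x3 rotation matrix giving the box orientation
    dimensions: full box extents along its local x, y, z axes
    The corner ordering here is an assumption, not the dataset's convention.
    """
    half = [d / 2.0 for d in dimensions]
    corners = []
    for signs in product((-1.0, 1.0), repeat=3):
        # Corner in the box's local frame, before rotation and translation.
        local = [sign * h for sign, h in zip(signs, half)]
        # Rotate into the world frame, then translate by the box center.
        world = tuple(
            sum(rotation[i][j] * local[j] for j in range(3)) + center[i]
            for i in range(3)
        )
        corners.append(world)
    return corners


# Example: a 2 x 1 x 4 box centered at (0, 0, 1), rotated 90 degrees about y.
corners = box_corners((0.0, 0.0, 1.0), yaw_rotation(math.pi / 2), (2.0, 1.0, 4.0))
```

Storing the box as a center, a rotation, and three extents keeps the annotation compact, and the corner points can be recomputed whenever they are needed, for example to project the box into a video frame using the recorded camera pose.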