NVlabs/VoxFormer

VoxFormer is a PyTorch implementation of a vision transformer for predicting 3D semantic occupancy from 2D camera images.

1.2k stars · Python · Computer Vision · Domain Apps
Velocity · 7d: +1.0 ★ / day
Trend: steady

VoxFormer is a sparse voxel transformer that converts 2D camera inputs into 3D semantic occupancy predictions, supporting scene understanding for autonomous vehicles. It uses deformable attention within a transformer architecture and reports state-of-the-art results on SemanticKITTI and other benchmarks. The project includes model implementations, training scripts, and evaluation tools for 3D semantic scene completion.
