NVlabs/GCVit

Global Context Vision Transformer (GC ViT) is a PyTorch vision transformer model for image classification, object detection, and semantic segmentation.

450 stars · Python · Computer Vision
Velocity (7d): +0.3 ★/day · Trend: steady

GC ViT adds global context attention to vision transformers, so the model can capture long-range dependencies across an image efficiently. It achieves competitive performance on the ImageNet classification, COCO object detection, and ADE20K semantic segmentation benchmarks. The repository is the official NVIDIA implementation and provides pretrained checkpoints and training code.
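To make the idea of global context attention concrete, here is a minimal pure-Python sketch. It is not the repository's code. It assumes the mechanism works roughly as follows: one set of query vectors is computed from the whole image and shared by every local window, where it attends to that window's keys and values. The function names (`global_query`, `global_context_attention`), the average-pooling query generator, and the absence of learned projections are all illustrative assumptions.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(queries, keys, values):
    """Scaled dot-product attention over plain lists of vectors."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    dim = len(values[0])
    out = []
    for q in queries:
        weights = softmax([dot(q, k) * scale for k in keys])
        out.append([sum(w * v[d] for w, v in zip(weights, values))
                    for d in range(dim)])
    return out


def global_query(tokens, num_queries):
    """Hypothetical stand-in for a learned global-query generator:
    average-pool all tokens of the image into num_queries vectors.
    Assumes len(tokens) is divisible by num_queries."""
    chunk = len(tokens) // num_queries
    dim = len(tokens[0])
    return [[sum(t[d] for t in tokens[i * chunk:(i + 1) * chunk]) / chunk
             for d in range(dim)]
            for i in range(num_queries)]


def global_context_attention(tokens, window_size):
    """Split tokens into local windows, then let one shared set of
    global queries attend to each window's keys and values.
    Keys and values are the raw tokens here (no learned projections).
    Assumes len(tokens) is divisible by window_size."""
    windows = [tokens[i:i + window_size]
               for i in range(0, len(tokens), window_size)]
    shared_q = global_query(tokens, window_size)  # same queries for all windows
    out = []
    for window in windows:
        out.extend(attend(shared_q, window, window))
    return out


if __name__ == "__main__":
    # Eight 2-dimensional tokens, grouped into two windows of four.
    tokens = [[float(i), float(i % 3)] for i in range(8)]
    for vec in global_context_attention(tokens, window_size=4):
        print([round(x, 3) for x in vec])
```

Each output row is a weighted average of the tokens in a single window, but the weights come from queries that summarise the whole image. In this simplified picture, that is how information from distant regions reaches a local window without computing attention between every pair of tokens.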