
NVlabs/FasterViT

A vision transformer architecture with hierarchical attention mechanisms for efficient image understanding tasks.

916 stars · Python · Computer Vision · ML Frameworks
Velocity (7d): +0.8 ★/day · Trend: steady

FasterViT is a PyTorch implementation of a hierarchical vision transformer that introduces Hierarchical Attention (HAT), which captures both short- and long-range spatial information by passing carrier tokens across local windows. It serves as a backbone for computer vision tasks including image classification on ImageNet, object detection on COCO, and semantic segmentation on ADE20K. The model achieves competitive accuracy-throughput trade-offs on these benchmarks.
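To make the carrier-token idea concrete, here is a minimal PyTorch sketch. It is not the repository's code: the class name `ToyHierarchicalAttention`, the mean-pooled carrier token (one per window), and the use of `nn.MultiheadAttention` are all assumptions chosen for brevity. It only illustrates the general pattern of attending locally within windows while a small set of summary tokens carries information between them.

```python
import torch
import torch.nn as nn


class ToyHierarchicalAttention(nn.Module):
    """Simplified carrier-token attention. Illustrative only, not FasterViT's module."""

    def __init__(self, dim: int, window: int, heads: int = 4):
        super().__init__()
        self.window = window
        # Attention among carrier tokens (long-range, across windows).
        self.global_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Attention inside each window over its tokens plus its carrier token.
        self.local_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, H, W, C); H and W are assumed divisible by the window size.
        B, H, W, C = x.shape
        w = self.window
        nh, nw = H // w, W // w

        # Partition into non-overlapping windows: (B * nh * nw, w * w, C).
        win = x.view(B, nh, w, nw, w, C).permute(0, 1, 3, 2, 4, 5)
        win = win.reshape(B * nh * nw, w * w, C)

        # Assumption: one carrier token per window, formed by mean pooling.
        carrier = win.mean(dim=1).view(B, nh * nw, C)

        # Carrier tokens exchange information across all windows.
        carrier = carrier + self.global_attn(
            carrier, carrier, carrier, need_weights=False
        )[0]

        # Each window attends over [its carrier token, its local tokens].
        seq = torch.cat([carrier.reshape(B * nh * nw, 1, C), win], dim=1)
        seq = seq + self.local_attn(seq, seq, seq, need_weights=False)[0]

        # Drop the carrier token and undo the window partition.
        win = seq[:, 1:].reshape(B, nh, nw, w, w, C)
        return win.permute(0, 1, 3, 2, 4, 5).reshape(B, H, W, C)


if __name__ == "__main__":
    layer = ToyHierarchicalAttention(dim=32, window=4)
    features = torch.randn(2, 8, 8, 32)
    print(layer(features).shape)
```

In this sketch the local attention cost scales with the window size rather than the full image, and only the carrier tokens (one per window) take part in global attention, which is the intuition behind the accuracy-throughput trade-off described above.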
