
google-research/maxvit

Multi-axis vision transformer model for image classification, detection, and segmentation tasks.

499 stars · Jupyter Notebook · Computer Vision · ML Frameworks
Velocity (7d): +0.3 ★/day · Trend: steady

This is the official TensorFlow implementation of MaxViT, a multi-axis vision transformer published at ECCV 2022. It provides foundation models, which the authors present as state-of-the-art, for image classification, object detection, semantic segmentation, image quality assessment, and generative modeling. The architecture's multi-axis attention combines blocked local (window) attention with dilated global (grid) attention, so each block mixes information both within small neighborhoods and across the whole feature map.
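The difference between the two attention axes is easiest to see in how positions on the feature map are grouped before attention is applied. The sketch below is a minimal, dependency-free illustration and is not taken from the repository: the function names `block_partition` and `grid_partition` and their parameters are hypothetical, and it only computes which `(row, col)` positions would attend to each other, not the attention itself.

```python
def block_partition(height, width, window):
    """Group positions into non-overlapping window x window blocks (local axis)."""
    if height % window or width % window:
        raise ValueError("height and width must be divisible by window")
    groups = []
    for r0 in range(0, height, window):
        for c0 in range(0, width, window):
            # Each group holds adjacent positions inside one window.
            groups.append([(r0 + dr, c0 + dc)
                           for dr in range(window)
                           for dc in range(window)])
    return groups


def grid_partition(height, width, grid):
    """Group positions into grid x grid sets sampled at a fixed stride (global axis)."""
    if height % grid or width % grid:
        raise ValueError("height and width must be divisible by grid")
    stride_r, stride_c = height // grid, width // grid
    groups = []
    for r0 in range(stride_r):
        for c0 in range(stride_c):
            # Each group holds positions spread evenly over the whole map.
            groups.append([(r0 + i * stride_r, c0 + j * stride_c)
                           for i in range(grid)
                           for j in range(grid)])
    return groups


# Illustrative 4x4 feature map with a 2x2 window and a 2x2 grid.
print(block_partition(4, 4, 2)[0])  # [(0, 0), (0, 1), (1, 0), (1, 1)]
print(grid_partition(4, 4, 2)[0])   # [(0, 0), (0, 2), (2, 0), (2, 2)]
```

In both cases every position lands in exactly one group of fixed size, which is why the cost of attending within groups grows linearly with the number of positions rather than quadratically.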
