xxxnell/how-do-vits-work

PyTorch implementation of an ICLR 2022 paper analyzing how Vision Transformers work in computer vision.

820 stars · Python · Computer Vision · ML Frameworks
Star velocity (7d): +0.4 ★ / day · Trend: steady

This repository is the official implementation of a peer-reviewed ICLR 2022 paper on how Vision Transformers work. It investigates how Multi-head Self-Attention (MSA) modules benefit neural networks, asking whether they act mainly as spatial smoothing or as a way to capture long-range dependencies. The work introduces AlterNet, a hybrid architecture that combines CNN blocks with MSA blocks placed at the end of stages, and provides tools for analyzing loss landscapes and the frequency response of attention mechanisms.
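To make the "spatial smoothing" reading concrete, here is a minimal pure-Python sketch, not taken from the repository. It assumes a single attention head with identity query, key, and value projections, and the toy `tokens` are made up for illustration. Because softmax weights are positive and sum to one, each output token is a weighted average of the input tokens, so it cannot leave the range the inputs span.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption for illustration: Q, K and V are the tokens themselves
    (identity projections), which real MSA layers do not use.
    """
    d = len(tokens[0])
    scale = 1.0 / math.sqrt(d)
    weights = []
    for q in tokens:
        # Similarity of this query to every key, scaled by 1/sqrt(d).
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        weights.append(softmax(scores))
    # Each output is a convex combination (weighted average) of the values.
    outputs = [
        [sum(w * v[c] for w, v in zip(row, tokens)) for c in range(d)]
        for row in weights
    ]
    return weights, outputs


# Hypothetical toy "feature map": four tokens with two channels each.
tokens = [[1.0, 0.0], [0.8, 0.3], [-0.5, 1.0], [0.2, -0.9]]
weights, outputs = self_attention(tokens)

for row, out in zip(weights, outputs):
    print([round(w, 3) for w in row], "->", [round(o, 3) for o in out])
```

The averaging property shown here is only one ingredient of the smoothing argument. The frequency-response and loss-landscape analyses mentioned above are what the repository's own tools are for.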
