sail-sg/poolformer
A PyTorch implementation of PoolFormer, a vision model using pooling-based token mixing instead of attention mechanisms.

Velocity (7d): +0.8 ★/day · Trend: steady
This repository provides the official PyTorch implementation of PoolFormer from the CVPR 2022 paper "MetaFormer Is Actually What You Need for Vision". The paper argues that the success of vision transformers stems from their general architecture, which it calls MetaFormer, rather than from any specific token mixer such as attention. To test this, PoolFormer uses a simple pooling operation as the token mixer and still achieves competitive accuracy on image classification benchmarks such as ImageNet-1K.
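To make "pooling as the token mixer" concrete, here is a minimal, dependency-free sketch of the idea on a single-channel feature map. It is an illustration under stated assumptions, not the repository's code: the names `pool_token_mixer` and `residual_mix` are hypothetical, and the normalization layer and channel MLP of a full block are omitted. Each token is replaced by the average of its spatial neighbourhood, and the input is subtracted because the enclosing block adds it back through a residual connection.

```python
def pool_token_mixer(grid, pool_size=3):
    """Average-pool each token with its spatial neighbours, then subtract the input.

    grid: 2D list (H x W) of floats, one channel of a feature map.
    Border windows average only over in-bounds cells, so padding is not counted.
    (Hypothetical helper for illustration; not the repository's API.)
    """
    height, width = len(grid), len(grid[0])
    radius = pool_size // 2
    mixed = []
    for i in range(height):
        row = []
        for j in range(width):
            # Collect the in-bounds neighbourhood centred on (i, j).
            window = [
                grid[y][x]
                for y in range(max(0, i - radius), min(height, i + radius + 1))
                for x in range(max(0, j - radius), min(width, j + radius + 1))
            ]
            # Mean of the window minus the token itself.
            row.append(sum(window) / len(window) - grid[i][j])
        mixed.append(row)
    return mixed


def residual_mix(grid, pool_size=3):
    """Apply the mixer with a residual connection: output = input + mixer(input).

    Simplification: a full block would normalize the input before mixing.
    """
    mixed = pool_token_mixer(grid, pool_size)
    return [[g + m for g, m in zip(g_row, m_row)] for g_row, m_row in zip(grid, mixed)]


if __name__ == "__main__":
    spike = [[0.0, 0.0, 0.0], [0.0, 9.0, 0.0], [0.0, 0.0, 0.0]]
    for row in residual_mix(spike):
        print(row)
```

The mixer has no learnable parameters, which is the point of the design: any spatial mixing comes purely from local averaging. In PyTorch, the same operation would typically be written with `torch.nn.AvgPool2d` using `stride=1`, `padding=pool_size // 2`, and `count_include_pad=False`, followed by subtracting the input.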