Is poolformer open source?

Yes — sail-sg/poolformer is open source, released under the Apache-2.0 license.

What language is poolformer written in?

sail-sg/poolformer is primarily written in Python.

How popular is poolformer?

sail-sg/poolformer has 1.4k stars on GitHub.

Where can I find poolformer?

sail-sg/poolformer is on GitHub at https://github.com/sail-sg/poolformer.

sail-sg/poolformer

This vision model swaps attention for pooling and still beats DeiT

It tests whether vision transformers actually need self-attention, or if the generic MetaFormer backbone is doing the real work.

★ 1.4k stars · Python · Computer Vision · ML Frameworks

What it does

PoolFormer is the reference PyTorch implementation for a CVPR 2022 paper that questions whether vision transformers need self-attention. It implements a family of image-classification models that replace the standard attention layer with a simple non-parametric spatial-pooling operator, yet still reach competitive ImageNet-1K accuracy. The repository also bundles configs and pretrained weights for downstream detection and segmentation on COCO and ADE20K.
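The swap described above can be sketched in a few lines of PyTorch. This is a minimal illustration, not code copied from the repository: the class names `PoolingMixer` and `MetaFormerBlock`, the 3x3 window, the single-group `GroupNorm`, and the MLP ratio of 4 are all assumptions chosen for the example.

```python
import torch
import torch.nn as nn


class PoolingMixer(nn.Module):
    """Hypothetical token mixer with no learnable weights: local average pooling."""

    def __init__(self, pool_size: int = 3):
        super().__init__()
        # Stride 1 with symmetric padding keeps the spatial size unchanged.
        self.pool = nn.AvgPool2d(
            pool_size, stride=1, padding=pool_size // 2, count_include_pad=False
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Subtract the input because the enclosing block adds it back
        # through its residual connection.
        return self.pool(x) - x


class MetaFormerBlock(nn.Module):
    """Illustrative block: norm -> token mixer -> norm -> channel MLP, both with residuals."""

    def __init__(self, dim: int, mlp_ratio: int = 4):
        super().__init__()
        self.norm1 = nn.GroupNorm(1, dim)  # assumed normalization choice
        self.token_mixer = PoolingMixer()  # the slot attention would normally fill
        self.norm2 = nn.GroupNorm(1, dim)
        # 1x1 convolutions act as a per-position channel MLP on (B, C, H, W) tensors.
        self.mlp = nn.Sequential(
            nn.Conv2d(dim, dim * mlp_ratio, kernel_size=1),
            nn.GELU(),
            nn.Conv2d(dim * mlp_ratio, dim, kernel_size=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x + self.token_mixer(self.norm1(x))
        x = x + self.mlp(self.norm2(x))
        return x


block = MetaFormerBlock(dim=64)
x = torch.randn(2, 64, 14, 14)  # (batch, channels, height, width)
y = block(x)
mixer_params = sum(p.numel() for p in block.token_mixer.parameters())
print(y.shape, mixer_params)
```

In this sketch the token mixer is the only part that changes relative to an attention-based block; the normalization layers, residual connections, and channel MLP stay in place, which is the structure the MetaFormer argument is about.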

The interesting bit

The authors are not chasing state-of-the-art; they are making a structural argument. They drop a basic pooling layer into the standard transformer block, with no learned weights in the token mixer and no cost that grows quadratically with the number of tokens, and use the result to argue that the MetaFormer skeleton itself carries most of the model's representational power. It is a controlled experiment disguised as a codebase.
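To make the complexity point concrete, here is a rough back-of-the-envelope comparison of token-mixing cost. It is a sketch under stated assumptions, not the repository's MAC counter: the function names are hypothetical, the attention count covers only the two token-by-token matrix products (projections are left out), and each pooled value is counted as one operation per neighbor in the window.

```python
def attention_mixing_macs(num_tokens: int, dim: int) -> int:
    """Approximate multiply-accumulates for Q @ K^T plus attn @ V.

    Each product costs num_tokens^2 * dim; Q/K/V/output projections are ignored.
    """
    return 2 * num_tokens ** 2 * dim


def pooling_mixing_ops(num_tokens: int, dim: int, pool_size: int = 3) -> int:
    """Approximate operations for average pooling: one per neighbor per output value."""
    return num_tokens * dim * pool_size ** 2


# Feature-map sides and channel width chosen only for illustration.
for side in (14, 28, 56):
    n = side * side
    attn = attention_mixing_macs(n, dim=64)
    pool = pooling_mixing_ops(n, dim=64)
    print(f"{side}x{side} tokens: attention ~{attn:,} vs pooling ~{pool:,}")
```

Doubling the number of tokens quadruples the attention estimate but only doubles the pooling estimate, which is why the gap widens at the high-resolution feature maps used for detection and segmentation.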

Key highlights

Five model sizes (S12 to M48) scaling from 12M to 73M parameters, with top-1 ImageNet accuracy ranging from 77.2% to 82.5%.
The token mixer is literally pooling: no attention heads and no token-mixing MLP, just a non-parametric average-pooling operation (the per-token channel MLP in each block is kept).
Includes pretrained checkpoints, Grad-CAM visualization scripts, and MAC-counting utilities.
Detection and segmentation configs are built on top of MMDetection and MMSegmentation.
The authors explicitly note that pooling is merely a tool to support their claim about architecture versus operators.

Verdict

Grab this if you are researching vision backbones or want a lightweight, attention-free baseline that still punches above its weight. Skip it if you are looking for a production-ready library; this is a research artifact with follow-up work now living in the broader MetaFormer repository.
