Is DynamicViT open source?

Yes — raoyongming/DynamicViT is open source, released under the MIT license.

What language is DynamicViT written in?

raoyongming/DynamicViT is primarily written in Jupyter Notebook.

How popular is DynamicViT?

raoyongming/DynamicViT has 668 stars on GitHub.

Where can I find DynamicViT?

raoyongming/DynamicViT is on GitHub at https://github.com/raoyongming/DynamicViT.

← all repositories

raoyongming/DynamicViT

Vision Transformers That Skip the Boring Patches

It adds a dynamic token-pruning gate to vision transformers so the model spends compute only on image regions that actually matter for each input.

★668 stars Jupyter Notebook Computer Vision ML Frameworks

View on GitHub ↗ Homepage ↗

Not currently ranked — collecting fresh signals.

star history

What it does

DynamicViT prunes redundant tokens in vision transformers progressively and dynamically based on each input image. The same sparsification idea has since been extended to ConvNeXt and Swin Transformers. The authors report this reduces over 30% of FLOPs and improves throughput by more than 40% on ImageNet while keeping accuracy drops under 0.5%.

The interesting bit

Because the sparsification is input-dependent, the model can drop more tokens for simple images and retain them for complex ones. The T-PAMI extension applies the same logic to object detection and semantic segmentation, suggesting visual redundancy is not limited to classification.

Key highlights

Pretrained ImageNet models for DeiT, LVViT, ConvNeXt, and Swin with adjustable keep ratios.
Reported gains of >30% FLOPs reduction and >40% throughput improvement with <0.5% accuracy loss.
A Jupyter notebook and Colab demo to visualize which tokens are pruned.
Training configurations and checkpoints are provided for reproduction.

Caveats

The repository focuses on ImageNet classification; object detection and segmentation extensions are discussed in the T-PAMI paper but not obviously included here.
Full training recipes assume multi-GPU distributed setups; the README does not describe single-GPU training.
The FLOPs and throughput claims are paper figures; the repo lacks standalone benchmarking scripts to verify them independently.

Verdict

Useful if you want to speed up an existing ViT or ConvNeXt backbone without architecting a new compact model from scratch. Less compelling if you need ready-made support for detection, segmentation, or training on limited hardware.

Frequently asked

What is raoyongming/DynamicViT?: It adds a dynamic token-pruning gate to vision transformers so the model spends compute only on image regions that actually matter for each input.
Is DynamicViT open source?: Yes — raoyongming/DynamicViT is open source, released under the MIT license.
What language is DynamicViT written in?: raoyongming/DynamicViT is primarily written in Jupyter Notebook.
How popular is DynamicViT?: raoyongming/DynamicViT has 668 stars on GitHub.
Where can I find DynamicViT?: raoyongming/DynamicViT is on GitHub at https://github.com/raoyongming/DynamicViT.