czczup/ViT-Adapter
A Vision Transformer adapter module designed to enhance dense prediction tasks including object detection and semantic segmentation.

Velocity (7d): +1.0 ★/day · Trend: steady
This project implements an adapter architecture that equips a standard Vision Transformer (ViT) for dense prediction tasks such as object detection, semantic segmentation, and instance segmentation. Published at ICLR 2023, it reported state-of-the-art results at the time on COCO (detection and instance segmentation) and ADE20K (semantic segmentation). It does so by adding adapter modules to the ViT backbone that improve feature representation for pixel-level predictions.