pengzhiliang/Conformer
A hybrid CNN-Transformer architecture for visual recognition, published at ICCV 2021.

Velocity (7d): +0.3 ★/day
Trend: steady
Conformer combines a convolutional neural network with a visual transformer so that one model captures both local features and global representations. The two branches run concurrently, and a Feature Coupling Unit (FCU) interactively fuses the CNN branch's local features with the transformer branch's global representations across different resolutions. It serves as a general-purpose backbone for image classification, object detection, and instance segmentation.
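The coupling between the two branches can be illustrated with a minimal NumPy sketch. This is not the repository's code: the function names `cnn_to_tokens` and `tokens_to_cnn`, the random projection matrices, and the use of average pooling and nearest-neighbour upsampling are all assumptions chosen to show how a feature map and a set of patch tokens at different resolutions might be aligned and added to each other. Normalization layers, the class token, and the actual convolution and attention blocks are omitted.

```python
import numpy as np

def cnn_to_tokens(feat, proj, stride):
    """Map a CNN feature map (C, H, W) to patch tokens (N, D). Hypothetical helper."""
    c, h, w = feat.shape
    # Average-pool each stride x stride window so the grid matches the token grid.
    pooled = feat.reshape(c, h // stride, stride, w // stride, stride).mean(axis=(2, 4))
    # Flatten the grid row-major and project channels C -> D (stands in for a 1x1 conv).
    return pooled.reshape(c, -1).T @ proj

def tokens_to_cnn(tokens, proj, grid, stride):
    """Map patch tokens (N, D) back to a feature map (C, H, W). Hypothetical helper."""
    gh, gw = grid
    # Project channels D -> C, then restore the 2D token grid.
    fmap = (tokens @ proj).T.reshape(-1, gh, gw)
    # Nearest-neighbour upsampling back to the CNN branch's resolution.
    return fmap.repeat(stride, axis=1).repeat(stride, axis=2)

rng = np.random.default_rng(0)
C, D, H, W, stride = 8, 16, 8, 8, 4           # illustrative sizes only
grid = (H // stride, W // stride)

feat = rng.standard_normal((C, H, W))          # CNN branch: local features
tokens = rng.standard_normal((grid[0] * grid[1], D))  # transformer branch: patch tokens
w_c2t = rng.standard_normal((C, D)) * 0.1      # assumed projection, CNN -> tokens
w_t2c = rng.standard_normal((D, C)) * 0.1      # assumed projection, tokens -> CNN

# Both branches stay alive; each receives the other's features as a residual.
tokens = tokens + cnn_to_tokens(feat, w_c2t, stride)
feat = feat + tokens_to_cnn(tokens, w_t2c, grid, stride)
```

Both directions change channel width and spatial resolution together, which is why each helper pairs a projection with a pooling or upsampling step. Neither branch is replaced by the other, so the dual-branch structure is preserved after every exchange.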