
whai362/PVT

A Pyramid Vision Transformer implementation providing backbone models for image classification, object detection, and semantic segmentation.

1.9k stars · Python · Computer Vision · ML Frameworks
PVT
Velocity (7d): +1.0 ★ / day · Trend: steady

This repository contains the official implementations of PVTv1 and PVTv2, Transformer-based architectures designed as drop-in backbones for vision tasks. The models report strong results on ImageNet-1K classification, COCO object detection, and semantic segmentation benchmarks. PVTv2 improves on PVTv1 and compares favorably to alternatives such as Swin Transformer.
