
YuqingWang1029/VisTR

End-to-end video instance segmentation framework using a transformer architecture.

756 stars Python Computer Vision
Velocity (7d): +0.4 ★/day · Trend: steady

VisTR implements an end-to-end approach to video instance segmentation: a transformer jointly processes the frames of a video clip and predicts instance masks across time. The model adapts DETR, a transformer-based detection framework, to video, so that instance tracking and segmentation are handled together without additional post-processing. It is intended for computer vision research on video understanding.
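
To illustrate why a jointly predicted sequence can remove the need for a separate tracking step, here is a minimal pure-Python sketch. It assumes the model emits one flat, frame-major sequence of predictions in which query slot q refers to the same instance in every frame. The function name, the string placeholders standing in for masks, and the exact layout are hypothetical and are not taken from the VisTR codebase.

```python
from typing import Dict, List


def group_predictions_by_instance(
    flat_preds: List[str], num_frames: int, num_queries: int
) -> Dict[int, List[str]]:
    """Regroup a frame-major flat prediction sequence into per-instance tracks.

    Assumption (hypothetical layout): the prediction for frame t and query
    slot q sits at index t * num_queries + q, and slot q denotes the same
    instance in every frame.
    """
    if len(flat_preds) != num_frames * num_queries:
        raise ValueError("sequence length must equal num_frames * num_queries")

    tracks: Dict[int, List[str]] = {q: [] for q in range(num_queries)}
    for t in range(num_frames):
        for q in range(num_queries):
            # A fixed slot index is the only "association" step needed.
            tracks[q].append(flat_preds[t * num_queries + q])
    return tracks


# Placeholder strings stand in for predicted masks: 3 frames, 2 query slots.
flat = [f"frame{t}_slot{q}" for t in range(3) for q in range(2)]
tracks = group_predictions_by_instance(flat, num_frames=3, num_queries=2)
print(tracks[1])
```

Under this assumed layout, reading off a track is a plain index lookup, which is the sense in which tracking and segmentation come out of a single prediction rather than a later matching stage.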
