happyharrycn/actionformer_release

A Transformer-based deep learning model that detects and localizes actions in untrimmed video sequences.

563 stars Python Computer Vision

ActionFormer implements a minimalist Transformer architecture for temporal action localization. It classifies every moment in an input video and regresses the corresponding action boundaries, without relying on proposals or anchors. The model reports state-of-the-art results on THUMOS14 (71.0% mAP at tIoU=0.5), ActivityNet 1.3 (36.56% average mAP), and EPIC-Kitchens (13.5 percentage points higher average mAP than prior work). It was also used in winning solutions for the Ego4D Moment Queries Challenge 2022, where a submission built on it ranked 2nd with 21.76% average mAP and 42.54% Recall@1x.
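To make the "classify every moment, regress the boundaries" idea concrete, here is a minimal sketch in plain Python. It is not the repository's code: the function names (`decode_segments`, `temporal_iou`), the input layout, and the `stride` and `score_thresh` parameters are all assumptions made for illustration. It also leaves out parts a real pipeline would need, such as multi-scale features and non-maximum suppression. Each time step carries per-class scores and a pair of non-negative distances to the action's start and end, and tIoU is the temporal overlap measure used in the metrics quoted above.

```python
from typing import List, Sequence, Tuple

# Hypothetical output type: (start, end, class_id, score), times in seconds.
Segment = Tuple[float, float, int, float]


def decode_segments(
    cls_scores: Sequence[Sequence[float]],
    offsets: Sequence[Tuple[float, float]],
    stride: float = 1.0,
    score_thresh: float = 0.5,
) -> List[Segment]:
    """Turn per-moment predictions into action segments (illustrative only).

    cls_scores[t][c] is the score of class c at time step t.
    offsets[t] is (distance to start, distance to end), measured in time steps.
    stride is the assumed duration of one time step in seconds.
    """
    segments: List[Segment] = []
    for t, (scores, (d_start, d_end)) in enumerate(zip(cls_scores, offsets)):
        # Pick the best class at this moment.
        class_id = max(range(len(scores)), key=scores.__getitem__)
        score = scores[class_id]
        if score < score_thresh:
            continue  # this moment is treated as background
        center = t * stride
        segments.append(
            (center - d_start * stride, center + d_end * stride, class_id, score)
        )
    return segments


def temporal_iou(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Temporal intersection-over-union of two (start, end) intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


# Toy input: three time steps, two classes; only step 1 is confident.
toy_scores = [[0.1, 0.2], [0.9, 0.05], [0.3, 0.1]]
toy_offsets = [(0.0, 0.0), (1.0, 2.0), (0.0, 0.0)]
predicted = decode_segments(toy_scores, toy_offsets)
print(predicted)
print(temporal_iou(predicted[0][:2], (1.0, 4.0)))
```

In this toy input only the middle time step clears the threshold, so a single segment is produced and then compared against a made-up ground-truth interval. A detection counts as correct at tIoU=0.5 when this overlap is at least 0.5 and the class matches.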
