lucidrains/TimeSformer-pytorch
A PyTorch implementation of TimeSformer, an attention-based deep learning model for video classification.

Velocity (7d): +0.4 ★/day. Trend: steady.
This repository implements TimeSformer from Facebook AI, a pure attention-based architecture for video understanding. It uses divided space-time attention, applying attention along the time axis before spatial attention. The model processes video tensors of shape (batch x frames x channels x height x width) and outputs classification predictions. It is built with PyTorch and supports variable-length video batching via masking.
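To make the divided space-time idea concrete, here is a minimal pure-Python sketch of which tokens attend to each other in each step. It is an illustration only, not the repository's code: the function names `attention_groups` and `pairwise_scores` are hypothetical, tokens are written as `(frame, patch)` index pairs, and the classification token, multiple heads, and masking are all left out.

```python
def attention_groups(num_frames, patches_per_frame):
    """Group (frame, patch) token ids for divided space-time attention.

    Hypothetical helper for illustration; not part of any library API.
    """
    # Time attention: tokens at the same patch position attend across frames.
    time_groups = [
        [(f, p) for f in range(num_frames)]
        for p in range(patches_per_frame)
    ]
    # Spatial attention: tokens in the same frame attend across patch positions.
    space_groups = [
        [(f, p) for p in range(patches_per_frame)]
        for f in range(num_frames)
    ]
    return time_groups, space_groups


def pairwise_scores(groups):
    """Count query-key score entries when attention stays inside each group."""
    return sum(len(group) ** 2 for group in groups)


# Toy clip: 3 frames, 4 patches per frame (assumed sizes, chosen for readability).
num_frames, patches_per_frame = 3, 4
time_groups, space_groups = attention_groups(num_frames, patches_per_frame)

# Divided attention runs the time step first, then the spatial step.
divided = pairwise_scores(time_groups) + pairwise_scores(space_groups)

# Joint attention would compare every token with every other token.
joint = (num_frames * patches_per_frame) ** 2

print(time_groups[0])   # tokens for patch 0 across all frames
print(space_groups[1])  # tokens for frame 1 across all patches
print(divided, joint)
```

In this toy setting the two divided steps compute fewer score entries than a single joint attention over all tokens would, and the gap widens as the number of frames and patches grows.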