google-research/magvit

A masked generative video transformer, implemented in JAX, that tokenizes video and generates it with a transformer.

1k stars · Python · Image · Video · Audio
Velocity (7d): +0.8 ★ / day · Trend: steady

MAGVIT is a masked generative video transformer: it tokenizes video frames and generates video with transformer-based masked modeling over those tokens. It reports state-of-the-art results on video generation and prediction benchmarks including UCF-101, Kinetics-600, and BAIR robot pushing. The official JAX implementation supports both training and inference.
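
To make the masked modeling idea above concrete, here is a minimal sketch of the masking step on a toy grid of video tokens. It is not taken from the MAGVIT codebase: the names `MASK_ID`, `cosine_mask_ratio`, and `mask_tokens` are hypothetical, the cosine schedule is an assumption borrowed from common practice in masked generative transformers, and NumPy stands in for JAX to keep the example dependency-light.

```python
import numpy as np

MASK_ID = -1  # hypothetical sentinel marking a masked position


def cosine_mask_ratio(u):
    """Map u in [0, 1) to a mask ratio in (0, 1] (assumed cosine schedule)."""
    return np.cos(0.5 * np.pi * u)


def mask_tokens(tokens, ratio, rng):
    """Replace a random subset of token ids with MASK_ID.

    tokens: integer array of any shape (e.g. time x height x width).
    ratio:  fraction of positions to mask, in [0, 1].
    Returns (masked_tokens, boolean_mask), both shaped like tokens.
    """
    flat = tokens.reshape(-1)
    # Always mask at least one position so there is something to predict.
    num_masked = max(1, int(round(ratio * flat.size)))
    idx = rng.choice(flat.size, size=num_masked, replace=False)
    mask = np.zeros(flat.size, dtype=bool)
    mask[idx] = True
    masked = np.where(mask, MASK_ID, flat)
    return masked.reshape(tokens.shape), mask.reshape(tokens.shape)


rng = np.random.default_rng(0)
# Toy "tokenized video": 4 frames of 8x8 token ids from a 1024-entry codebook.
tokens = rng.integers(0, 1024, size=(4, 8, 8))
ratio = float(cosine_mask_ratio(rng.uniform()))
masked, mask = mask_tokens(tokens, ratio, rng)
```

In a training setup along these lines, a transformer would receive `masked` and be trained to predict the original ids at the positions where `mask` is true; the tokenizer and the transformer themselves are outside the scope of this sketch.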