
lucidrains/magvit2-pytorch

PyTorch implementation of MagViT2, a state-of-the-art video tokenizer for visual generation models.

Velocity (7d): +0.7 ★ / day · Trend: steady

This repository provides a PyTorch implementation of the MagViT2 tokenizer from the paper ‘Language Model Beats Diffusion - Tokenizer is Key to Visual Generation’. The tokenizer converts video frames into discrete tokens using a lookup-free quantizer (LFQ) and a transformer-based architecture, which supports efficient video understanding and generation. Image size, codebook size, and layer structure are all configurable, so you can train custom video tokenizers.
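To make the idea of lookup-free quantization concrete, here is a minimal pure-Python sketch of the general technique: each latent dimension is reduced to its sign, and the resulting bit pattern is read as the token index, so a codebook of size 2^d needs no learned embedding table. The function names (`lfq_quantize`, `lfq_decode`, `codebook_dim`) are hypothetical and chosen for illustration only. This is not the repository's API, and it leaves out training details such as gradient estimation and auxiliary losses.

```python
def codebook_dim(codebook_size):
    """Return the number of latent dimensions needed for a given codebook size.

    Assumption for this sketch: the codebook size is a power of two,
    since each dimension contributes exactly one bit.
    """
    if codebook_size < 2 or codebook_size & (codebook_size - 1) != 0:
        raise ValueError("codebook_size must be a power of two >= 2")
    return codebook_size.bit_length() - 1


def lfq_quantize(latent):
    """Quantize one latent vector by sign and compute its token index.

    Positive entries map to +1.0 (bit set), all others to -1.0 (bit clear).
    Dimension i contributes 2**i to the index when its bit is set.
    """
    codes = [1.0 if x > 0 else -1.0 for x in latent]
    index = sum(1 << i for i, x in enumerate(latent) if x > 0)
    return codes, index


def lfq_decode(index, dim):
    """Recover the +1/-1 code vector from a token index, without a lookup table."""
    return [1.0 if (index >> i) & 1 else -1.0 for i in range(dim)]


if __name__ == "__main__":
    codes, index = lfq_quantize([0.3, -1.2, 0.8])
    print(codes, index)
    print(lfq_decode(index, 3))
    print(codebook_dim(1024))
```

In this sketch a codebook size of 1024 corresponds to 10 sign bits per token, which shows how the codebook size setting relates to the latent dimension in a scheme of this kind.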
