facebookresearch/mae
A PyTorch implementation of Masked Autoencoders (MAE), a self-supervised pre-training method for vision transformers aimed at scalable image representation learning.

This repository provides a PyTorch re-implementation of the MAE paper on masked autoencoders for vision; the original implementation was in TensorFlow on TPUs. It includes pre-training code for self-supervised learning on images, fine-tuning scripts alongside pre-trained ViT checkpoints in Base, Large, and Huge sizes, and an interactive visualization demo for exploring learned representations. The implementation builds on the timm library and the DeiT repository.
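
For readers new to the method, masked autoencoding hides a large share of an image's patches and trains the model to reconstruct them from the visible remainder. The snippet below is a minimal, library-free sketch of that masking step only, not code taken from this repository. The function name `random_masking`, the 75% mask ratio, and the 196-patch grid are illustrative assumptions.

```python
import random


def random_masking(num_patches, mask_ratio=0.75, seed=None):
    """Split patch indices into a visible (kept) list and a masked list.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    rng = random.Random(seed)
    # Number of patches the encoder would get to see.
    len_keep = int(num_patches * (1 - mask_ratio))
    # Shuffle all patch indices, then treat the first len_keep as visible.
    shuffled = list(range(num_patches))
    rng.shuffle(shuffled)
    keep = sorted(shuffled[:len_keep])
    masked = sorted(shuffled[len_keep:])
    return keep, masked


# Assumed setup: a 224x224 image cut into 16x16 patches gives 14 * 14 = 196 patches.
keep, masked = random_masking(num_patches=196, mask_ratio=0.75, seed=0)
print(len(keep), len(masked))  # prints "49 147"
```

In this sketch only the 49 visible patches would be passed to the encoder, and the 147 masked positions would be the reconstruction targets. A tensor-based version would do the same shuffle per sample in a batch.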