zihangJiang/TokenLabeling
PyTorch implementation of LV-ViT (Large-scale Vision Transformer) with token labeling for improved image classification and segmentation.

This repository implements the paper ‘All Tokens Matter: Token Labeling for Training Better Vision Transformers’. It provides training code and pre-trained models for LV-ViT, a vision transformer trained with token labeling, which supervises every patch token with its own label in addition to the class (CLS) token. The implementation builds on the timm (pytorch-image-models) library and includes scripts for generating the token-label data, as well as a segmentation variant of the model.
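
To make the token labeling idea concrete, here is a minimal, framework-free sketch of how a per-token loss could be combined with the usual class-token loss. It is not the repository's code: the function names, the list-based inputs, and the `beta` weight (with an illustrative default of 0.5) are assumptions chosen for the example, and the real training code operates on PyTorch tensors.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def soft_cross_entropy(logits, target):
    """Cross-entropy between a (possibly soft) target distribution and softmax(logits)."""
    log_probs = log_softmax(logits)
    return -sum(t * lp for t, lp in zip(target, log_probs))


def token_labeling_loss(cls_logits, cls_target, token_logits, token_targets, beta=0.5):
    """Hypothetical combined loss: class-token loss plus a weighted mean of per-token losses.

    cls_logits / cls_target: one list of class scores / target probabilities.
    token_logits / token_targets: one such list per patch token.
    beta: assumed weight on the token-level term.
    """
    cls_loss = soft_cross_entropy(cls_logits, cls_target)
    per_token = [soft_cross_entropy(l, t) for l, t in zip(token_logits, token_targets)]
    token_loss = sum(per_token) / len(per_token)
    return cls_loss + beta * token_loss


# Usage: two classes, two patch tokens, each token carrying its own soft label.
loss = token_labeling_loss(
    cls_logits=[2.0, -1.0],
    cls_target=[1.0, 0.0],
    token_logits=[[1.5, 0.0], [-0.5, 0.5]],
    token_targets=[[0.9, 0.1], [0.2, 0.8]],
)
print(f"combined loss: {loss:.4f}")
```

The point of the sketch is that every patch token contributes its own supervised term, so the training signal is dense across the image instead of coming from a single image-level label.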