FrancescoSaverioZuppichini/ViT

A tutorial-style PyTorch implementation of the Vision Transformer (ViT) for image recognition at scale.

Velocity (7d): +0.2 ★/day · Trend: steady

This repository provides a complete implementation of the Vision Transformer (ViT) architecture in PyTorch. It builds the model block by block: patch embedding, positional encoding, transformer encoder layers with self-attention and residual connections, and the classification head. The implementation is structured as an educational tutorial that shows how standard transformer mechanisms can be applied to image classification.
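
To make the four building blocks concrete, here is a minimal sketch of how they might fit together in PyTorch. It is not the repository's code: the class names (`PatchEmbedding`, `EncoderBlock`, `TinyViT`), the small default sizes, the pre-norm layout, the strided-convolution patch projection, and the class-token readout are all assumptions chosen for illustration, and the actual tutorial may make different choices.

```python
import torch
from torch import nn


class PatchEmbedding(nn.Module):
    """Cut the image into patches, project them, prepend a class token, add positions."""

    def __init__(self, in_channels=3, patch_size=16, emb_size=64, img_size=32):
        super().__init__()
        # Assumption: a strided convolution cuts and projects patches in one step.
        self.proj = nn.Conv2d(in_channels, emb_size, kernel_size=patch_size, stride=patch_size)
        num_patches = (img_size // patch_size) ** 2
        self.cls_token = nn.Parameter(torch.zeros(1, 1, emb_size))
        # Learned positional embedding: one vector per token (patches + class token).
        self.pos_embed = nn.Parameter(torch.randn(1, num_patches + 1, emb_size) * 0.02)

    def forward(self, x):
        x = self.proj(x)                         # (B, E, H/P, W/P)
        x = x.flatten(2).transpose(1, 2)         # (B, N, E)
        cls = self.cls_token.expand(x.shape[0], -1, -1)
        x = torch.cat([cls, x], dim=1)           # (B, N + 1, E)
        return x + self.pos_embed


class EncoderBlock(nn.Module):
    """Pre-norm transformer encoder layer: self-attention and MLP, each with a residual."""

    def __init__(self, emb_size=64, num_heads=4, mlp_ratio=4):
        super().__init__()
        self.norm1 = nn.LayerNorm(emb_size)
        self.attn = nn.MultiheadAttention(emb_size, num_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(emb_size)
        self.mlp = nn.Sequential(
            nn.Linear(emb_size, mlp_ratio * emb_size),
            nn.GELU(),
            nn.Linear(mlp_ratio * emb_size, emb_size),
        )

    def forward(self, x):
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        x = x + attn_out                         # residual around self-attention
        return x + self.mlp(self.norm2(x))       # residual around the MLP


class TinyViT(nn.Module):
    """Hypothetical small ViT: embedding, stacked encoder blocks, classification head."""

    def __init__(self, img_size=32, patch_size=16, emb_size=64, depth=2,
                 num_heads=4, num_classes=10):
        super().__init__()
        self.embed = PatchEmbedding(3, patch_size, emb_size, img_size)
        self.blocks = nn.Sequential(*[EncoderBlock(emb_size, num_heads) for _ in range(depth)])
        self.norm = nn.LayerNorm(emb_size)
        self.head = nn.Linear(emb_size, num_classes)

    def forward(self, x):
        x = self.blocks(self.embed(x))
        # Assumption: classify from the class token (index 0); mean pooling is another option.
        return self.head(self.norm(x[:, 0]))
```

A batch of two 32×32 RGB images passed through `TinyViT()` yields logits of shape `(2, 10)`: each image becomes four 16×16 patches plus one class token, so the encoder sees five tokens per image.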