Ma-Lab-Berkeley/CRATE
A PyTorch implementation of White-Box Transformers (CRATE), derived from sparse rate reduction, for segmentation and representation learning.

This repository contains the official PyTorch implementation of CRATE (Coding RAte reduction TransformEr), a research framework that explains transformer architectures through sparse rate reduction. It reinterprets standard transformer components, such as multi-head self-attention and feed-forward networks, as steps that optimize an information-theoretic objective, which makes the network a white-box alternative to black-box deep learning models. The repository also includes implementations of the segmentation and masked autoencoding methods, which were published at top ML venues.
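
The information-theoretic objective mentioned above can be written down as a rough sketch. The notation here is not part of this description and is an assumption based on how the sparse rate reduction objective is commonly stated: $Z \in \mathbb{R}^{d \times N}$ holds $N$ token representations of dimension $d$, $U_k \in \mathbb{R}^{d \times p}$ are bases of $K$ subspaces, $\epsilon$ is a quantization precision, and $\lambda$ weights the sparsity term.

```latex
% Sparse rate reduction: expand the whole set of tokens, compress it
% against K subspaces, and keep the representation sparse.
\max_{f} \; \mathbb{E}_{Z = f(X)}
  \Big[ R(Z) \;-\; R^{c}\big(Z;\, U_{[K]}\big) \;-\; \lambda \,\lVert Z \rVert_{0} \Big]

% Coding rate of the full token set.
R(Z) = \frac{1}{2} \log\det\!\Big( I + \frac{d}{N \epsilon^{2}} \, Z Z^{\top} \Big)

% Coding rate of the tokens projected onto each subspace, summed over subspaces.
R^{c}\big(Z;\, U_{[K]}\big) = \sum_{k=1}^{K} \frac{1}{2}
  \log\det\!\Big( I + \frac{p}{N \epsilon^{2}} \,
  \big(U_k^{\top} Z\big)\big(U_k^{\top} Z\big)^{\top} \Big)
```

Under this reading, the attention-like block can be viewed as an approximate gradient step that lowers the compression term $R^{c}$, while the block that takes the place of the feed-forward network can be viewed as a sparsification step, typically with $\lVert Z \rVert_{0}$ relaxed to $\lVert Z \rVert_{1}$. The exact layer definitions are in the repository code and the papers, not in this sketch.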