lucidrains/perceiver-pytorch
A PyTorch implementation of the Perceiver, a transformer-based architecture that combines iterative cross-attention with latent self-attention for general perception tasks.

This repository provides a PyTorch implementation of the Perceiver model from Google DeepMind, a general-purpose architecture that processes inputs of arbitrary modality using iterative attention. A small set of latent queries cross-attends over the large input array, so the attention cost grows linearly with input size rather than quadratically as in standard self-attention. The implementation supports configurable depth, number of attention heads, Fourier-feature (frequency) position encoding, and weight tying across layers, which makes it adaptable to image, video, or multimodal perception tasks.
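
To make the complexity claim concrete, the sketch below counts query-key score pairs for a single attention layer (per head) under each scheme. It is a back-of-the-envelope illustration rather than code from the repository: the helper `attention_score_counts`, the 224 x 224 input size, and the choice of 256 latents are all assumptions picked for the example.

```python
def attention_score_counts(num_inputs: int, num_latents: int) -> dict:
    """Count query-key score pairs for one attention layer (per head).

    Hypothetical helper for illustration only; not part of the library.
    """
    return {
        # Standard self-attention: every input attends to every input.
        "full_self_attention": num_inputs * num_inputs,
        # Perceiver cross-attention: each latent attends to every input.
        "cross_attention": num_latents * num_inputs,
        # Latent self-attention: latents attend only to each other.
        "latent_self_attention": num_latents * num_latents,
    }


# Assumed example: a 224 x 224 image flattened to one token per pixel,
# processed with 256 latent vectors.
counts = attention_score_counts(num_inputs=224 * 224, num_latents=256)
for name, n in counts.items():
    print(f"{name:>22}: {n:,}")
```

Under these assumed sizes, cross-attention needs 196 times fewer score pairs than full self-attention over the pixels (the ratio is simply `num_inputs / num_latents`), and the latent self-attention cost does not depend on the input size at all.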