dorarad/gansformer
A transformer-based GAN architecture for high-resolution image and scene generation using bipartite attention and iterative latent variable refinement.

GANformer implements a generative adversarial transformer designed for image generation. The model uses a bipartite structure: a small set of latent variables attends to the image features and vice versa, rather than every image position attending to every other. This enables long-range interactions across the image while keeping computation linear in the number of image positions, so the model scales to high-resolution synthesis. Information propagates iteratively between the latent variables and the visual features, so each refines the other, which encourages compositional representations of objects and scenes. Unlike a standard transformer, which adds the attention output to its input, GANformer integrates it multiplicatively, allowing flexible region-based modulation of the features and generalizing the classic transformer update for visual generation.
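To make the bipartite structure and the multiplicative update more concrete, here is a minimal NumPy sketch of a single refinement step. It is an illustration under stated assumptions, not the repository's implementation: the names `attend` and `bipartite_step`, the projection matrices `w_gamma` and `w_beta`, the per-position normalization, and the `(1 + gamma)` scaling are all hypothetical choices made for this example.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def attend(queries, keys, values):
    """Scaled dot-product attention; the score matrix is (len(queries), len(keys))."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ values


def bipartite_step(features, latents, w_gamma, w_beta, eps=1e-5):
    """One illustrative refinement step between n image features and k latents.

    Both attention calls build an n-by-k (or k-by-n) score matrix, so the cost
    grows linearly with n for a fixed k, unlike n-by-n self-attention.
    """
    # 1) Latents gather information from the image features (additive update).
    latents = latents + attend(latents, features, features)

    # 2) Image features attend to the refined latents.
    context = attend(features, latents, latents)

    # 3) Multiplicative integration (assumed form): the attended context sets a
    #    per-position scale and bias applied to the normalized features.
    gamma = context @ w_gamma
    beta = context @ w_beta
    mean = features.mean(axis=-1, keepdims=True)
    std = features.std(axis=-1, keepdims=True)
    normalized = (features - mean) / (std + eps)
    features = (1.0 + gamma) * normalized + beta
    return features, latents


# Toy usage: 64 image positions, 4 latents, 8 channels (all sizes are arbitrary).
rng = np.random.default_rng(0)
n, k, d = 64, 4, 8
features = rng.normal(size=(n, d))
latents = rng.normal(size=(k, d))
w_gamma = rng.normal(size=(d, d)) * 0.1
w_beta = rng.normal(size=(d, d)) * 0.1

new_features, new_latents = bipartite_step(features, latents, w_gamma, w_beta)
```

Because positions that attend to the same latent receive similar scale and bias, the update acts region by region, which is the intuition behind region-based modulation. Calling `bipartite_step` repeatedly on its own outputs gives a rough picture of the iterative refinement described above.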