brain-research/self-attention-gan

GANs learn to look around: attention meets image generation

A research reproduction that adds self-attention mechanisms to GANs so the generator can reason globally instead of just stacking local patches.

Velocity (7d): +0.3 ★/day · Trend: steady

What it does

This is a TensorFlow 1.5 implementation of the 2018 SAGAN paper by Zhang et al. (Ian Goodfellow is a co-author), which adds self-attention layers to both the generator and the discriminator of a GAN. The goal is sharper, more coherent ImageNet generation: fewer “mushy textures in the wrong place” and more structurally consistent objects. You feed it ImageNet data preprocessed into TFRecords, wait several weeks for a million training steps on four GPUs, and hopefully get better samples out the other end.

The interesting bit

The twist is borrowing “non-local” attention from video understanding and applying it to image synthesis. Instead of each convolution seeing only its immediate neighborhood, the generator can explicitly relate distant regions, which helps with constraints like “the dog’s head should match the dog’s body.” The README also notes that bigger batches help, which quietly hints at the hardware arms race lurking inside most modern GAN research.
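To make the "relate distant regions" idea concrete, here is a minimal NumPy sketch of a self-attention block over a 2D feature map. It is not the repository's TensorFlow code. The function name `self_attention_2d`, the weight names, the channel-last layout, and the reduced query/key width of C/8 are illustrative assumptions, and the 1x1 convolutions are modeled as plain matrix multiplies over flattened pixels.

```python
import numpy as np

def softmax(z, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_2d(x, w_query, w_key, w_value, gamma):
    """Hypothetical helper: attention across all pixel positions of x (H, W, C)."""
    h, w, c = x.shape
    flat = x.reshape(h * w, c)               # (N, C), one row per pixel
    query = flat @ w_query                   # (N, C_k)
    key = flat @ w_key                       # (N, C_k)
    value = flat @ w_value                   # (N, C)
    # Each row of attn says how much one pixel draws on every other pixel,
    # regardless of how far apart they are in the image.
    attn = softmax(query @ key.T, axis=-1)   # (N, N)
    out = (attn @ value).reshape(h, w, c)
    # Residual connection scaled by gamma; with gamma = 0 the block is an identity.
    return gamma * out + x, attn

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 16))          # toy feature map, assumed shape
c = x.shape[-1]
w_query = rng.standard_normal((c, c // 8)) * 0.1
w_key = rng.standard_normal((c, c // 8)) * 0.1
w_value = rng.standard_normal((c, c)) * 0.1

y, attn = self_attention_2d(x, w_query, w_key, w_value, gamma=0.0)
```

The attention matrix has one row and one column per pixel, so its size grows with the square of the feature-map area. The scalar `gamma` is a learned parameter in the paper and starts at zero, so the network begins with purely local convolutional features and mixes in global context gradually.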

Key highlights

  • Reproduces the SAGAN paper with spectral normalization and projection discriminator included
  • Multi-GPU training script included (batch size 256 across 4 GPUs by default)
  • Evaluation script separate from training, so you can generate samples without restarting training
  • Straightforward citation block and references to the three key prior works
  • ~1,018 stars suggest it was a useful reference implementation in the TF 1.x era
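For readers unfamiliar with the spectral normalization mentioned in the highlights, the sketch below shows the general technique: estimate a weight matrix's largest singular value by power iteration, then divide the weights by it. This is a NumPy illustration under stated assumptions, not the repository's implementation. The name `spectral_normalize`, the `(kh, kw, in, out)` kernel layout, and the default iteration count are all hypothetical.

```python
import numpy as np

def spectral_normalize(weight, n_iters=1, u=None, eps=1e-12):
    """Hypothetical helper: scale weight so its largest singular value is about 1."""
    # Collapse all leading axes so the kernel becomes a 2D matrix (fan_in, out).
    w_mat = weight.reshape(-1, weight.shape[-1])
    if u is None:
        # Assumed initialization: a random vector over the output channels.
        u = np.random.default_rng(0).standard_normal(w_mat.shape[1])
    for _ in range(n_iters):
        # Power iteration: alternate between left and right singular vector estimates.
        v = w_mat @ u
        v /= np.linalg.norm(v) + eps
        u = w_mat.T @ v
        u /= np.linalg.norm(u) + eps
    sigma = v @ w_mat @ u                    # estimate of the top singular value
    return weight / sigma, u

rng = np.random.default_rng(1)
kernel = rng.standard_normal((3, 3, 16, 32))  # toy conv kernel, assumed layout
kernel_sn, u = spectral_normalize(kernel, n_iters=200)
top_singular_value = np.linalg.svd(kernel_sn.reshape(-1, 32), compute_uv=False)[0]
```

Training code typically runs a single iteration per step and carries `u` over between steps, which is why the function returns it. The many iterations here are only so the toy example converges in one call.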

Caveats

  • Stuck on TensorFlow 1.5 and Python 3.6; this is archaeology now, not infrastructure
  • “Several weeks” of training on 4 GPUs is a serious compute commitment for a reproduction
  • README is minimal: no sample outputs, no pretrained weights, no hyperparameter guidance beyond “you might need to find new learning rates”

Verdict

Worth a look if you’re studying GAN architecture evolution or need to verify the original SAGAN results against a clean reimplementation. Skip it if you want something that runs on modern PyTorch/TF2 without a migration project.
