Justin-Tan/generative-compression

GANs that compress images by hallucinating the details back

A TensorFlow implementation of extreme learned image compression that trades exact reconstruction for tiny file sizes by letting a generator dream up the textures.

Velocity (7d): +0.2 ★/day · Trend: steady

What it does

This repo implements Agustsson et al.’s method for compressing images to roughly 0.072 bits per pixel using a GAN. The encoder squeezes the image through a narrow bottleneck (C = 4 to 16 channels), discarding most of the information; a decoder, fed with sampled noise, then reconstructs something that looks plausible to human eyes but may get content wrong, for example rendering greenery where there were buildings. It includes both global compression and a conditional variant that uses semantic label maps to guide reconstruction.
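To see where a figure like 0.072 bits per pixel can come from, here is a back-of-the-envelope sketch. It assumes the setup described in the paper rather than anything stated on this page: each bottleneck value is quantized to one of 5 levels, and the encoder shrinks height and width by a factor of 16. The function names are made up for illustration.

```python
import math


def bits_per_pixel(channels, levels=5, downsample=16):
    """Upper bound on bits per pixel for a bottleneck with `channels` maps.

    Assumptions (from the paper's setup, not from this page): every
    bottleneck value takes one of `levels` values, and the encoder
    reduces height and width by `downsample`.
    """
    bits_per_symbol = math.log2(levels)
    symbols_per_pixel = channels / (downsample ** 2)
    return symbols_per_pixel * bits_per_symbol


def compressed_bytes(width, height, channels, **kwargs):
    """Approximate payload size in bytes for a width x height image."""
    return bits_per_pixel(channels, **kwargs) * width * height / 8


for c in (4, 8, 16):
    print(f"C={c:2d}: {bits_per_pixel(c):.4f} bpp")
```

Under these assumptions C=8 comes out at about 0.0726 bpp, in line with the quoted 0.072, and the rate scales linearly with C. This is an upper bound before any entropy coding.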

The interesting bit

The project openly leans into the hallucination problem: the README notes the decoder “hallucinates greenery in buildings, and vice-versa,” and the noise-sampling step explicitly increases this “hallucination factor.” Rather than fighting it, the method treats perceptual plausibility as the goal, not pixel-perfect fidelity. There’s also a frank admission that the original paper leaves some implementation details unspecified, so the repo’s author filled the gaps with reasonable guesses.
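A toy sketch of why sampled noise raises the “hallucination factor”: only the quantized bottleneck is stored, while the noise is drawn fresh at decode time, so two decodes of the same file can differ in detail. The five integer centers, the Gaussian noise, and the function names are assumptions for illustration; this is not the repo's code, which works on TensorFlow tensors.

```python
import random

CENTERS = (-2, -1, 0, 1, 2)  # assumed quantization centers


def quantize(values, centers=CENTERS):
    """Snap each value to its nearest center (nearest-neighbor quantization)."""
    return [min(centers, key=lambda c: abs(v - c)) for v in values]


def decoder_input(bottleneck, noise_dim, rng):
    """Quantized bottleneck followed by `noise_dim` Gaussian noise values.

    Only the quantized part would need to be stored; the noise is
    sampled again on every decode.
    """
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return quantize(bottleneck) + noise


w = [0.4, -1.7, 2.9, -0.2]
print(quantize(w))  # [0, -2, 2, 0]
print(decoder_input(w, noise_dim=3, rng=random.Random(0)))
```

The stored part is identical across decodes; everything the decoder invents beyond it depends on the noise draw and on what the generator learned.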

Key highlights

  • Pretrained models provided for C=8 bottleneck (0.072 bpp) with both noise-sampled and conditional-GAN variants
  • Conditional compression uses semantic maps from Cityscapes gtFine annotations
  • Training scripts, single-image compression, and TensorBoard logging included
  • Hyperparameters live in config.py instead of a sprawling argparse block—opinionated but readable
  • Author also maintains a follow-up repo for higher-bitrate, higher-fidelity compression
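For readers unfamiliar with the config-module pattern mentioned above, here is a minimal sketch of the idea: settings are plain class attributes that scripts import, instead of command-line flags. The class name, attribute names, and helper are hypothetical and not taken from the repo's config.py; the values only echo figures mentioned on this page.

```python
class TrainConfig:
    """Hypothetical settings; names are illustrative, not the repo's."""
    bottleneck_channels = 8      # C, within the 4 to 16 range noted above
    batch_size = 1               # this page notes batch size is fixed at 1
    sample_noise = True          # noise-sampled variant
    use_conditional_gan = False  # conditional variant with semantic maps


def describe(config):
    """Return the public settings of a config class as a dict."""
    return {k: v for k, v in vars(config).items() if not k.startswith("_")}


print(describe(TrainConfig))
```

The trade-off is that changing a run means editing a file rather than passing a flag, which is simple to read but harder to script.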

Caveats

  • Stuck on TensorFlow 1.8; the pretrained models were trained on TF 1.3, and the README says only that they “appear to load without problems”
  • Batch size is hardcoded to 1, which underuses modern GPUs and makes training slow
  • The conditional GAN implementation yields the best visual results but is still labeled “experimental”
  • Several planned features remain in the TODO list: VGG loss, WGAN-GP, spectral normalization, and selective compression

Verdict

Worth a look if you’re researching perceptual compression or need a concrete baseline for extreme bitrate regimes. Skip it if you need bit-exact reconstruction or a production-ready pipeline—this is research code with known rough edges, not a drop-in replacement for JPEG.
