Painting over ImageNet to teach CNNs some geometry
A dataset builder that swaps textures for art styles while keeping shapes intact, forcing models to look at structure instead of surface patterns.

What it does
Stylized-ImageNet takes standard ImageNet photos and runs them through AdaIN style transfer using ~38GB of paintings from Kaggle’s painter-by-numbers dataset. The result: same objects, same poses, but textures scrambled into impressionist chaos. It’s a drop-in ImageNet replacement that costs 134GB of disk space and a GPU run through a provided shell script.
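To make the AdaIN step concrete, here is a minimal sketch of the core operation: each channel of the content features is shifted and scaled to match the mean and standard deviation of the corresponding style channel. This is not the repo's code. The real pipeline works on VGG feature tensors in PyTorch and decodes the result back into an image, whereas this version uses plain Python lists for readability. The function names, the flat-list layout, and the `eps` value are assumptions made for illustration.

```python
import statistics


def adain_channel(content, style, eps=1e-5):
    """Re-normalize one channel of content activations to the style channel's statistics.

    Both arguments are flat lists of floats (hypothetical layout for this sketch).
    """
    c_mean = statistics.fmean(content)
    c_std = statistics.pstdev(content) + eps  # eps guards against division by zero
    s_mean = statistics.fmean(style)
    s_std = statistics.pstdev(style) + eps
    # Whiten the content channel, then apply the style channel's scale and shift.
    return [s_std * (x - c_mean) / c_std + s_mean for x in content]


def adain(content_feats, style_feats, eps=1e-5):
    """Apply the per-channel transform to every channel.

    Each argument is a list of channels; each channel is a flat list of activations.
    """
    if len(content_feats) != len(style_feats):
        raise ValueError("content and style must have the same number of channels")
    return [adain_channel(c, s, eps) for c, s in zip(content_feats, style_feats)]
```

Because only the per-channel mean and spread change, the relative ordering of activations within a channel is preserved. That gives a rough intuition for why the spatial layout (shape) survives while the texture statistics are replaced.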
The interesting bit
The core finding is that standard CNNs are “texture junkies”: they classify by surface patterns, not object shape. By training on images where local texture is meaningless noise, the model has no choice but to attend to global geometry. The authors showed this shape bias actually improves accuracy and robustness, not just interpretability.
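The text above does not say how shape bias is measured, so treat the following as a sketch of one common definition rather than the authors' evaluation code. It assumes a set of cue-conflict images, each with one class given by its shape and a different class given by its texture, and it reports the fraction of shape decisions among predictions that matched either cue. The function name and the tuple layout are hypothetical.

```python
def shape_bias(predictions):
    """Fraction of shape decisions among predictions that matched either cue.

    `predictions` is an iterable of (predicted, shape_label, texture_label)
    tuples, one per cue-conflict image (hypothetical layout for this sketch).
    """
    shape_hits = 0
    texture_hits = 0
    for predicted, shape_label, texture_label in predictions:
        if shape_label == texture_label:
            continue  # no conflict between cues, so the image says nothing about bias
        if predicted == shape_label:
            shape_hits += 1
        elif predicted == texture_label:
            texture_hits += 1
        # Predictions matching neither cue are ignored.
    decided = shape_hits + texture_hits
    if decided == 0:
        raise ValueError("no prediction matched a shape or texture label")
    return shape_hits / decided
```

Under this definition, a value near 1.0 means the model mostly follows shape and a value near 0.0 means it mostly follows texture.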
Key highlights
- Built on Naoto Inoue’s pytorch-AdaIN implementation; the repo bundles preprocessing, style transfer, and pipeline glue in one place
- Docker image provided (tested on bethgelab/deeplearning:cuda9.0-cudnn7) to avoid dependency hell
- Pre-trained models available in the separate texture-vs-shape repository
- For non-ImageNet datasets, the authors point to bethgelab/stylize-datasets as the more general successor
- ICLR 2019 Oral; human comparison data and evaluation toolbox now live in bethgelab/model-vs-human
Caveats
- No direct dataset download available; you must build it yourself or find a colleague who already has
- CUDA 9.0 / cuDNN 7 Docker tag dates this somewhat; newer GPU setups may need massaging
- The actual paper code, data, and materials live in a different repository (texture-vs-shape), not here
Verdict
Worth your time if you’re probing CNN failure modes, studying out-of-distribution robustness, or need a principled way to disentangle shape and texture representations. Skip if you just want a quick augmented dataset: this is research infrastructure, not a data loader one-liner.