A benchmark that makes neural nets solve fluid dynamics
PDEBench is a NeurIPS 2022 benchmark suite for scientific ML, shipping datasets and baseline implementations for PDEs from advection to Navier-Stokes.

What it does
PDEBench provides datasets, generation code, and training pipelines for benchmarking machine learning on partial differential equations. It covers forward and inverse problems across advection, Burgers, reaction-diffusion, Darcy flow, shallow water, and compressible/incompressible Navier-Stokes. The repository includes ready-to-use HDF5 datasets, download scripts, and baseline implementations of FNO, U-Net, and PINN models with evaluation metrics and plotting.
The interesting bit
The project treats benchmark maintenance as ongoing infrastructure, not a one-off paper artifact. Authors explicitly invite community extensions, and the codebase is modular enough that adding a new PDE means dropping simulation code into a Hydra-configured directory and running a generation script. JAX simulation code runs roughly 6× faster than PyTorch equivalents in their tests, which matters when you’re generating training data for turbulent flows.
Key highlights
- Datasets and pretrained models are archived with DOIs via the University of Stuttgart’s Dataverse
- Baseline models include Fourier Neural Operator, U-Net, and Physics-Informed Neural Networks through DeepXDE
- Data generation supports both NumPy-output and direct HDF5 merge workflows
- Forward and inverse problem training pipelines with standardized metrics and CSV export for comparison
- PyPI installable: pip install pdebench
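
To make the "standardized metrics and CSV export" highlight concrete, here is a minimal sketch of the idea using only the Python standard library. It is not PDEBench's code: the function names (rmse, nrmse, metrics_to_csv), the model labels, and the numbers are all hypothetical, and the normalization shown is just one common choice that may differ from the definitions the benchmark uses.

```python
import csv
import io
import math


def rmse(pred, target):
    """Root-mean-square error between two equal-length sequences."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))


def nrmse(pred, target):
    """RMSE divided by the RMS magnitude of the target (one common normalization)."""
    scale = math.sqrt(sum(t ** 2 for t in target) / len(target))
    return rmse(pred, target) / scale


def metrics_to_csv(results):
    """Serialize {model_name: (pred, target)} into CSV text, one row per model."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["model", "rmse", "nrmse"])
    for name, (pred, target) in results.items():
        writer.writerow([name, f"{rmse(pred, target):.6f}", f"{nrmse(pred, target):.6f}"])
    return buf.getvalue()


# Toy values for illustration only, not benchmark results.
target = [3.0, 4.0]
results = {
    "model_a": ([3.0, 4.0], target),  # perfect prediction
    "model_b": ([0.0, 0.0], target),  # predicts zero everywhere
}
csv_text = metrics_to_csv(results)
print(csv_text)
```

Writing every model's scores through the same function into one table is what makes the comparison fair: each row is computed with identical definitions, so differences reflect the models and not the bookkeeping.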
Caveats
- Official targets are dated: Python 3.9, PyTorch 1.13.0, CUDA 11.7; newer versions are partially verified but “still under investigation” for some components
- Hydra modifies data_path at runtime in FNO and U-Net training scripts, requiring manual uncommenting of to_absolute_path lines as a workaround
- Data generation for some PDEs requires merging NumPy arrays to HDF5 before the dataloaders can read them
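
The Hydra caveat is easier to see with a small sketch. Hydra can run a job from a per-run output directory, so a relative data path may end up resolved against that directory instead of the one you launched from. Hydra's hydra.utils.to_absolute_path helper exists to resolve such paths against the original working directory. The stand-in below mimics that behavior with the standard library only; the function name, directory layout, and file name are hypothetical, and whether this is exactly what goes wrong in the PDEBench scripts is an assumption.

```python
import os


def resolve_against_launch_dir(path, launch_dir):
    """Return `path` unchanged if absolute, else join it onto `launch_dir`.

    Hypothetical stand-in for hydra.utils.to_absolute_path, which uses the
    original working directory that Hydra records at startup.
    """
    if os.path.isabs(path):
        return path
    return os.path.normpath(os.path.join(launch_dir, path))


# Hypothetical layout: the script is started in launch_dir,
# but the job then executes inside a per-run output directory.
launch_dir = os.path.join(os.sep, "home", "user", "project")
run_dir = os.path.join(launch_dir, "outputs", "run_1")

relative = os.path.join("data", "example.h5")  # hypothetical file name

# Resolving against the run directory points at a file that does not exist.
naive = os.path.normpath(os.path.join(run_dir, relative))

# Resolving against the launch directory recovers the intended location.
fixed = resolve_against_launch_dir(relative, launch_dir)

print(naive)
print(fixed)
```

Passing an absolute data path in the config sidesteps the issue entirely, since absolute paths are returned unchanged.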
Verdict
Researchers building or evaluating neural operators for physical simulation should start here rather than rolling their own ad-hoc comparisons. Casual practitioners may find the multi-step data generation and version compatibility notes more friction than they want for a quick experiment.