vincentherrmann/pytorch-wavenet

WaveNet in PyTorch, minus the molasses

A research-grade audio synthesis model that actually ships with fast generation and a working demo notebook.

1k stars Jupyter Notebook Image · Video · Audio

What it does

Implements DeepMind’s WaveNet architecture for raw audio generation, with a practical twist: it includes the fast generation algorithm, which caches intermediate activations instead of recomputing them for every new sample and so makes sample-by-sample inference practical. The repo also handles the grunt work of building datasets from loose audio files and piping everything into TensorBoard.
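The caching idea behind fast generation can be illustrated with a toy model. The sketch below is not the repo's code: it assumes scalar channels, kernel size 2, made-up weights, and a deterministic output in place of sampling from a softmax, and it omits gating, residual, and skip connections. All names (`DILATIONS`, `WEIGHTS`, `naive_generate`, `cached_generate`) are hypothetical. It only shows that keeping one queue of past inputs per dilated layer yields the same samples as recomputing the whole stack at every step.

```python
import math
from collections import deque

# Hypothetical toy stack: one scalar channel per layer, kernel size 2.
DILATIONS = [1, 2, 4, 8]
WEIGHTS = [(0.5, -0.3), (0.8, 0.2), (-0.6, 0.7), (0.4, 0.9)]  # (w_past, w_now)


def layer_step(w, past, now):
    """One output of a dilated causal convolution followed by tanh."""
    return math.tanh(w[0] * past + w[1] * now)


def naive_generate(seed, n_steps):
    """Recompute every layer over the whole history for each new sample."""
    samples = [seed]
    for _ in range(n_steps):
        acts = samples[:]  # input to the first layer
        for d, w in zip(DILATIONS, WEIGHTS):
            acts = [
                layer_step(w, acts[t - d] if t >= d else 0.0, acts[t])
                for t in range(len(acts))
            ]
        samples.append(acts[-1])  # top-layer output becomes the next sample
    return samples


def cached_generate(seed, n_steps):
    """Keep a queue of the last d inputs per layer; one cheap update per step."""
    queues = [deque([0.0] * d, maxlen=d) for d in DILATIONS]
    samples = [seed]
    for _ in range(n_steps):
        x = samples[-1]
        for q, w in zip(queues, WEIGHTS):
            past = q[0]   # this layer's input from d steps ago
            q.append(x)   # maxlen discards the oldest entry automatically
            x = layer_step(w, past, x)
        samples.append(x)
    return samples


if __name__ == "__main__":
    a = naive_generate(0.5, 20)
    b = cached_generate(0.5, 20)
    print(max(abs(x - y) for x, y in zip(a, b)))
```

In the cached version, the work per new sample is one `layer_step` per layer, regardless of how long the generated sequence already is.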

The interesting bit

Most open-source WaveNet implementations stop at “here’s the model, good luck.” This one bundles the follow-up paper’s speedup trick, multithreaded data loading, and a Jupyter demo that walks you through training. It’s pitched as a starting point for experiments rather than a black box.

Key highlights

  • Fast generation via the 2016 Fast Wavenet Generation Algorithm paper (not the later Parallel WaveNet, which is a different technique)
  • Auto-assembles train/validation splits from .wav, .aiff, and .mp3 files in a directory
  • TensorBoard logging with parameter histograms and generated audio samples
  • Includes a demo notebook and pre-generated audio clips to sanity-check results
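As a rough illustration of the dataset highlight, here is a minimal sketch of collecting audio files from a directory and splitting them into train and validation sets. It is an assumption-laden stand-in, not the repo's implementation: the function name `split_audio_files`, the file-level split, and the `val_fraction` and `seed` parameters are all hypothetical, and no audio is actually decoded.

```python
import random
from pathlib import Path

# Extensions named in the highlights above.
AUDIO_EXTENSIONS = {".wav", ".aiff", ".mp3"}


def split_audio_files(directory, val_fraction=0.1, seed=0):
    """Return (train_files, val_files) for audio files found under `directory`.

    Hypothetical helper: splits at the file level with a seeded shuffle so the
    split is reproducible between runs.
    """
    files = sorted(
        p
        for p in Path(directory).rglob("*")
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    )
    random.Random(seed).shuffle(files)
    n_val = int(len(files) * val_fraction)
    return files[n_val:], files[:n_val]
```

Sorting before the seeded shuffle keeps the split independent of the order in which the filesystem happens to list files.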

Caveats

  • Locked to PyTorch 0.3, which is years behind current releases and predates breaking API changes such as the merge of Variable into Tensor in 0.4
  • Requires TensorFlow installed solely for TensorBoard logging (a minor dependency indignity)
  • README is sparse on training details, hyperparameters, or expected hardware requirements

Verdict

Worth a look if you’re studying generative audio models or need a hackable WaveNet baseline. Skip it if you want production-ready code or a modern PyTorch stack; you’ll spend more time porting than training.
