Is wavegan open source?

Yes — chrisdonahue/wavegan is open source, released under the MIT license.

What language is wavegan written in?

chrisdonahue/wavegan is primarily written in Python.

How popular is wavegan?

chrisdonahue/wavegan has 1.4k stars on GitHub.

Where can I find wavegan?

chrisdonahue/wavegan is on GitHub at https://github.com/chrisdonahue/wavegan.

← all repositories

chrisdonahue/wavegan

GANs that learn to sing, drum, and chirp in raw waveforms

WaveGAN treats audio generation like image synthesis, but skips the spectrogram middleman and works directly on one-dimensional waveforms.

★1.4k stars Python Image · Video · Audio

View on GitHub ↗

Not currently ranked — collecting fresh signals.

star history

What it does WaveGAN trains generative adversarial networks to synthesize raw audio waveforms—speech, drums, birdsong, piano—by learning from folders of MP3s, WAVs, or OGGs without preprocessing. The repo includes both the waveform approach and a SpecGAN baseline that generates spectrograms instead. You can train on arbitrary audio files, adjust sample rates, handle multi-channel audio, and generate clips up to 4 seconds at 16kHz.

The interesting bit Most audio GANs operate on 2D spectrograms because image-generation techniques transfer cleanly. WaveGAN keeps everything in the time domain, using one-dimensional convolutions analogous to DCGAN’s image approach. The v2 update added streaming data loading and longer generation lengths in response to what the authors call “common requests”—a polite way of saying the first version was a bit rough for real use.

Key highlights

Train directly on audio files; no preprocessing required (v2 improvement)
Supports variable sample rates, multi-channel audio, and clip lengths up to 65,536 samples
Includes JavaScript generation code for browser-based demos (see the procedural drum machine)
TensorBoard monitoring, automatic checkpoint backups, and Inception score evaluation for SC09
Colab notebook provided for inference

Caveats

Locked to TensorFlow 1.12.0 and Python 3; this is legacy code now
Single-GPU training only; CPU training is “prohibitively slow and not recommended”
GAN training “may occasionally collapse”—hence the hourly backup script
SpecGAN comparison is limited to 1-second clips at 16kHz

Verdict Worth exploring if you’re researching generative audio or need a reproducible baseline for raw-waveform synthesis. Skip it if you want modern PyTorch, stable diffusion-based audio, or anything approaching production reliability.

Frequently asked

What is chrisdonahue/wavegan?: WaveGAN treats audio generation like image synthesis, but skips the spectrogram middleman and works directly on one-dimensional waveforms.
Is wavegan open source?: Yes — chrisdonahue/wavegan is open source, released under the MIT license.
What language is wavegan written in?: chrisdonahue/wavegan is primarily written in Python.
How popular is wavegan?: chrisdonahue/wavegan has 1.4k stars on GitHub.
Where can I find wavegan?: chrisdonahue/wavegan is on GitHub at https://github.com/chrisdonahue/wavegan.