Is SpecAugment open source?

Yes — DemisEom/SpecAugment is open source, released under the Apache-2.0 license.

What language is SpecAugment written in?

DemisEom/SpecAugment is primarily written in Python.

How popular is SpecAugment?

DemisEom/SpecAugment has 655 stars on GitHub.

Where can I find SpecAugment?

DemisEom/SpecAugment is on GitHub at https://github.com/DemisEom/SpecAugment.

DemisEom/SpecAugment

Google Brain's spectrogram trick, copy-pasted into PyTorch and TF

A straightforward port of SpecAugment for developers who want to warp and mask mel spectrograms without reading the paper.

★ 655 stars · Python · Data Tooling

What it does

Takes a mel spectrogram and applies three augmentations in sequence: time warping, frequency masking, and time masking. The result is a distorted spectrogram you feed into a speech recognition model instead of the original. Supports both TensorFlow and PyTorch, though the interface is just two separate import paths.
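The two masking steps can be sketched in plain Python. This illustrates the technique and is not the repository's code: the function name `mask_spectrogram`, its parameters, and the list-of-rows layout (mel bins by time frames) are all assumptions made for the example. Time warping is left out because it needs an image-warping routine.

```python
import random


def mask_spectrogram(spec, max_freq_width=8, max_time_width=20, fill=0.0, rng=None):
    """Return a copy of `spec` with one frequency band and one time span blanked.

    `spec` is assumed to be a list of rows, one row per mel bin, each row
    holding one value per time frame. All names here are hypothetical.
    """
    rng = rng or random.Random()
    n_mels = len(spec)
    n_frames = len(spec[0])
    out = [row[:] for row in spec]  # copy so the input stays untouched

    # Frequency masking: blank `f` consecutive mel bins starting at `f0`.
    f = rng.randint(0, min(max_freq_width, n_mels))
    f0 = rng.randint(0, n_mels - f)
    for i in range(f0, f0 + f):
        out[i] = [fill] * n_frames

    # Time masking: blank `t` consecutive frames starting at `t0` in every row.
    t = rng.randint(0, min(max_time_width, n_frames))
    t0 = rng.randint(0, n_frames - t)
    for row in out:
        row[t0:t0 + t] = [fill] * t

    return out


# Usage: a dummy 16-bin, 50-frame "spectrogram" filled with ones.
dummy = [[1.0] * 50 for _ in range(16)]
augmented = mask_spectrogram(dummy, rng=random.Random(0))
masked_cells = sum(v == 0.0 for row in augmented for v in row)
print(f"{masked_cells} of {16 * 50} cells masked")
```

In real training code the spectrogram would normally be an array or tensor, and the mask widths would be tuned to the dataset. The nested lists here only keep the sketch free of dependencies.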

The interesting bit

The original SpecAugment paper showed you could get state-of-the-art ASR results by augmenting spectrograms directly — no fancy audio domain tricks, no speed perturbation on raw waveforms. This repo is a literal implementation of that idea: warp, mask, done. The test code runs against LibriSpeech, so you can verify it actually produces the expected visual artifacts.

Key highlights

Dual backend support: spec_augment_tensorflow or spec_augment_pytorch
pip3 install SpecAugment — one-liner install
Includes before/after spectrogram images in the README
Apache 2.0 licensed
Test script provided with LibriSpeech example

Caveats

The README images are hotlinked from a different fork (shelling203/SpecAugment), not this repo — links may rot
No version pinning or dependency list shown; “some audio libraries work properly” is the full guidance
655 stars but sparse recent activity; this is a reference implementation, not a maintained package

Verdict

Grab this if you need a quick, working SpecAugment for a Kaggle competition or research baseline. Skip it if you want production-hardened augmentation — look at torchaudio transforms or nlpaug instead, which have actual maintainers.
