Is Speech-Backbones open source?

Yes — huawei-noah/Speech-Backbones is an open-source project tracked on heatdrop.

What language is Speech-Backbones written in?

huawei-noah/Speech-Backbones is primarily written in Jupyter Notebook.

How popular is Speech-Backbones?

huawei-noah/Speech-Backbones has 604 stars on GitHub.

Where can I find Speech-Backbones?

huawei-noah/Speech-Backbones is on GitHub at https://github.com/huawei-noah/Speech-Backbones.

huawei-noah/Speech-Backbones

Huawei's speech lab dumps three diffusion papers in one repo

A single landing pad for Grad-TTS, SPIRAL, and DiffVC — because maintaining separate repos is apparently harder than probabilistic diffusion modeling.

★604 stars · Jupyter Notebook · Image · Video · Audio · Inference · Serving

What it does

This is Huawei Noah’s Ark Lab’s monorepo for speech research code. It currently houses three projects: Grad-TTS (a text-to-speech system using diffusion probabilistic models), SPIRAL (a self-supervised speech pre-training method), and DiffVC (a diffusion-based voice converter). Each comes with its own paper and author list, but they share the same README real estate.
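For readers new to the term, the noising half of a diffusion probabilistic model can be sketched in a few lines. This is the generic discrete-time forward process with a linear noise schedule, used here as a stand-in; it is not claimed to be the exact formulation in Grad-TTS or DiffVC, and every name below (`linear_betas`, `cumulative_alphas`, `noise_sample`, the 1000-step schedule, the four-value "frame") is an assumption made for illustration.

```python
import math
import random

def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    # Linearly spaced per-step noise variances (a common default, assumed here).
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    # alpha_bar_t = product over s <= t of (1 - beta_s): the surviving signal fraction.
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out

def noise_sample(x0, alpha_bar_t, rng):
    # Draw x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps, eps ~ N(0, 1).
    scale_signal = math.sqrt(alpha_bar_t)
    scale_noise = math.sqrt(1.0 - alpha_bar_t)
    return [scale_signal * x + scale_noise * rng.gauss(0.0, 1.0) for x in x0]

rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_betas(1000))

frame = [0.2, -0.5, 0.9, 0.0]  # hypothetical mel-spectrogram frame
slightly_noisy = noise_sample(frame, alpha_bars[0], rng)   # early step: mostly signal
mostly_noise = noise_sample(frame, alpha_bars[-1], rng)    # final step: almost pure noise
```

A generative model of this family is trained to undo that corruption step by step, so sampling starts from noise and ends at a spectrogram.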

The interesting bit

The diffusion obsession is notable: two of the three projects, Grad-TTS and DiffVC, use diffusion models for generative speech tasks, which suggests the lab went deep on that particular wave before much of the field turned to flow matching. SPIRAL’s angle is more unusual: it learns representations that are invariant to artificial perturbations of the input, a kind of “what doesn’t kill the spectrogram makes it stronger” approach to pre-training.
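The perturbation-invariance idea can be illustrated with a toy teacher-student setup: the student only ever sees a perturbed input and is trained to reproduce what a fixed teacher outputs for the clean one. This is a minimal sketch under heavy assumptions, not SPIRAL's architecture or loss. The linear "encoder", the additive Gaussian noise, the mean-squared objective, and all function names are hypothetical stand-ins chosen to keep the example short.

```python
import random

def encode(frames, weights):
    # Toy linear "encoder": one scalar representation per frame.
    return [sum(w * x for w, x in zip(weights, f)) for f in frames]

def perturb(frames, noise_scale, rng):
    # Artificial perturbation: additive Gaussian noise on every feature.
    return [[x + rng.gauss(0.0, noise_scale) for x in f] for f in frames]

def invariance_loss(student_out, teacher_out):
    # Mean squared gap between student (perturbed) and teacher (clean) outputs.
    n = len(student_out)
    return sum((s - t) ** 2 for s, t in zip(student_out, teacher_out)) / n

def train_student(clean, noisy, teacher_w, student_w, lr=0.01, steps=200):
    # Plain gradient descent on the student only; the teacher stays fixed.
    target = encode(clean, teacher_w)
    w = list(student_w)
    for _ in range(steps):
        out = encode(noisy, w)
        for j in range(len(w)):
            grad = sum(2 * (o - t) * f[j] for o, t, f in zip(out, target, noisy)) / len(noisy)
            w[j] -= lr * grad
    return w

rng = random.Random(0)
clean = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(32)]
noisy = perturb(clean, noise_scale=0.1, rng=rng)
teacher_w = [0.5, -0.3, 0.8, 0.1]
student_w = [0.0, 0.0, 0.0, 0.0]

loss_before = invariance_loss(encode(noisy, student_w), encode(clean, teacher_w))
trained_w = train_student(clean, noisy, teacher_w, student_w)
loss_after = invariance_loss(encode(noisy, trained_w), encode(clean, teacher_w))
print(loss_before > loss_after)
```

The point of the design is that the student cannot lower the loss by memorizing the noise: the only target it gets comes from the clean signal, so whatever it learns has to survive the perturbation.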

Key highlights

Grad-TTS: ICML 2021, diffusion-based TTS with probabilistic modeling
SPIRAL: ICLR 2022, self-supervised learning with perturbation invariance
DiffVC: ICLR 2022 Oral, voice conversion with “fast maximum likelihood sampling”
All three include official implementations and arXiv links
Jupyter Notebook is the listed language, suggesting notebook-heavy demos or training workflows

Caveats

The README is essentially a table of contents with paper links; no installation steps, usage instructions, or model weights are visible
The README also appears to have no screenshots or architecture diagrams
It’s unclear whether these are actively maintained or snapshot releases for paper reproducibility

Verdict

Worth a bookmark if you’re tracing the diffusion-for-speech lineage or need official Grad-TTS/DiffVC baselines for comparison. Skip it if you want turnkey training scripts or a unified framework — this is a paper-code drop, not a product.
