Is soprano open source?

Yes — ekwek1/soprano is open source, released under the Apache-2.0 license.

What language is soprano written in?

ekwek1/soprano is primarily written in Python.

How popular is soprano?

ekwek1/soprano has 1.2k stars on GitHub.

Where can I find soprano?

ekwek1/soprano is on GitHub at https://github.com/ekwek1/soprano.

ekwek1/soprano

A Sub-1GB TTS Model Claiming 2000× Real-Time Speed on a GPU

Soprano is an 80M-parameter, on-device TTS engine built for speed, streaming, and a sub-gigabyte memory footprint.

★1.2k stars Python Image · Video · Audio

What it does

This project synthesizes English speech from text using an 80-million-parameter model that stays under 1 GB of memory. It targets local inference across CUDA, CPU, and Apple Silicon, offering a WebUI, CLI, Python API, and an OpenAI-compatible HTTP endpoint. The model automatically splits long text for “infinite” generation and outputs audio at 32 kHz.

The interesting bit

The main pitch is raw speed: the project claims up to 20× real-time synthesis on CPU and 2,000× on GPU, with streaming latency under 250 ms on CPU and under 15 ms on GPU. Alongside those numbers comes an unusually candid admission: the model was trained on only ~1,000 hours of audio, roughly a hundredth of the data used by larger TTS systems, so it occasionally stumbles over uncommon words.
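
To put the claimed multipliers in concrete terms, here is a small arithmetic sketch. It assumes "N× real-time" means N seconds of audio are produced per second of compute. The helper name `synthesis_seconds` is made up for illustration, and the figures are the project's claims, not measurements.

```python
def synthesis_seconds(audio_seconds: float, realtime_factor: float) -> float:
    """Compute time needed to synthesize audio at a given real-time factor."""
    if realtime_factor <= 0:
        raise ValueError("realtime_factor must be positive")
    return audio_seconds / realtime_factor


one_hour = 3600.0  # seconds of audio to generate

# Claimed factors from the project description (unverified here).
cpu_time = synthesis_seconds(one_hour, 20)    # 180.0 seconds, i.e. 3 minutes
gpu_time = synthesis_seconds(one_hour, 2000)  # about 1.8 seconds

print(f"CPU: {cpu_time:.1f} s, GPU: {gpu_time:.1f} s")
```

If the claims hold, an hour of speech would take about three minutes on CPU and under two seconds on GPU.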

Key highlights

80M-parameter architecture with a sub-1 GB memory footprint
Claims 20× real-time generation on CPU and 2,000× on GPU
Lossless streaming with <250 ms CPU latency and <15 ms GPU latency
OpenAI-compatible server endpoint, plus WebUI, CLI, ComfyUI nodes, and ONNX export
Automatic text splitting for arbitrarily long inputs
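
The page does not say how the automatic text splitting works. As a rough sketch of the general idea only, one common approach is to pack whole sentences greedily into chunks below a length limit and synthesize each chunk in turn. The function `split_text` and its `max_chars` parameter are hypothetical and are not taken from Soprano's code.

```python
import re


def split_text(text: str, max_chars: int = 200) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars.

    A single sentence longer than max_chars is kept whole as its own chunk.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}" if current else sentence
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow, so close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


chunks = split_text("First sentence. Second one! A third? Done.", max_chars=20)
print(chunks)
```

Splitting at sentence boundaries avoids cutting a chunk mid-phrase, which would otherwise tend to produce audible seams when the pieces are joined.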

Caveats

English-only, with no voice cloning, and trained on just ~1,000 hours of audio (about a hundredth of what larger TTS systems use), so expect occasional mispronunciations
The OpenAI-compatible server endpoint currently supports only non-streaming output
The CLI reloads the model on every invocation, and Windows CUDA setups need a manual PyTorch wheel swap to avoid a CPU-only install
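
Because the endpoint is described as OpenAI-compatible and non-streaming, a client can in principle send one POST request and read the whole audio body back at once. The sketch below uses only the Python standard library. The address `http://localhost:8000` is an assumption, as is the route `/v1/audio/speech`, which is the path OpenAI's own speech API uses. The OpenAI API also defines `model` and `voice` fields; whether Soprano's server requires or ignores them is not stated here, so check the project's README before relying on this.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: address of a locally running server


def build_speech_request(text: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request in the shape of OpenAI's speech endpoint."""
    payload = json.dumps({"input": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/audio/speech",  # assumption: OpenAI-style route
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_file(text: str, out_path: str) -> None:
    """Send the request and write the complete (non-streamed) response to disk."""
    request = build_speech_request(text)
    with urllib.request.urlopen(request) as response:
        audio_bytes = response.read()  # whole body at once, no streaming
    with open(out_path, "wb") as f:
        f.write(audio_bytes)


# Building the request does not touch the network; sending it requires a running server.
request = build_speech_request("Hello from a local TTS server.")
print(request.get_method(), request.full_url)
```

Calling `synthesize_to_file("Hello", "hello.wav")` would only work with a server actually listening at the assumed address. Since the response is not streamed, the first audio arrives only after the full clip has been generated.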

Verdict

Worth a look if you need a lightweight, offline English TTS for embedding in apps or local pipelines. Skip it for now if you need multilingual synthesis, voice cloning, or guaranteed accuracy on niche vocabulary.

Frequently asked

What is ekwek1/soprano?: Soprano is an 80M-parameter, on-device TTS engine built for speed, streaming, and a sub-gigabyte memory footprint.