abus-aikorea/voice-pro

A self-hosted dubbing studio that glues together every open-source voice model

Voice-Pro bundles Whisper, F5-TTS, CosyVoice, and a dozen other tools into a single Gradio interface for creators who want ElevenLabs-like results without the API bills.

★ 11.2k stars | Python | Inference · Serving | Image · Video · Audio

Velocity (7d): +6.6 ★ / day · Trend: steady

What it does

Voice-Pro is a Gradio-based web application that chains audio processing tasks end to end: download a YouTube video, isolate vocals with Demucs, transcribe with Whisper variants, translate via Deep-Translator, and synthesize speech using Edge-TTS, kokoro, or zero-shot voice cloning through F5-TTS, E2-TTS, and CosyVoice. The target audience is podcasters, video creators, and developers building multilingual content pipelines.
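The chain described above can be sketched as a list of command lines, one per stage. This is only an illustration of the stage order, not how Voice-Pro itself invokes these tools (it may well call their Python APIs instead). The function name, the working-directory layout, the `translated.txt` file, and the default voice are assumptions made for the example; the translation stage is left out because Deep-Translator is normally used as a Python library.

```python
from pathlib import Path


def build_pipeline_commands(url: str, workdir: str,
                            voice: str = "en-US-AriaNeural") -> list[list[str]]:
    """Return argv lists for ingest -> vocal separation -> transcription -> TTS.

    Nothing is executed here. Once the tools are installed, each list can be
    passed to subprocess.run(cmd, check=True).
    """
    work = Path(workdir)
    source = work / "source.wav"
    # Demucs writes stems to <out>/<model>/<track>/; the model name is pinned
    # with -n so the path below is predictable.
    vocals = work / "htdemucs" / "source" / "vocals.wav"
    script = work / "translated.txt"  # hypothetical file holding the translated text
    dubbed = work / "dubbed.mp3"

    return [
        # 1. Download the video and extract its audio track as WAV.
        ["yt-dlp", "-x", "--audio-format", "wav",
         "-o", str(work / "source.%(ext)s"), url],
        # 2. Split the audio into vocals and accompaniment.
        ["demucs", "-n", "htdemucs", "--two-stems=vocals",
         "-o", str(work), str(source)],
        # 3. Transcribe the isolated vocals to an SRT subtitle file.
        ["whisper", str(vocals), "--model", "small",
         "--output_format", "srt", "--output_dir", str(work)],
        # 4. Synthesize speech from the (separately translated) text.
        ["edge-tts", "--voice", voice,
         "--file", str(script), "--write-media", str(dubbed)],
    ]
```

Keeping the stages as plain data makes the order easy to inspect or log before anything heavy runs.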

The interesting bit

The project is essentially a curated integration layer, a kitchen-sink approach to open-source speech AI. Rather than inventing new models, it wires together a dozen specialized tools (WhisperX for alignment, spaCy for sentence segmentation, yt-dlp for ingestion) and exposes them through a single browser interface. The maintainers have also assembled a small zoo of fine-tuned F5-TTS checkpoints for languages including Finnish, Hindi, and Japanese.

Key highlights

Supports four Whisper variants for transcription, including timestamped and speaker-diarized outputs via WhisperX
Zero-shot voice cloning through three different backends: F5-TTS, E2-TTS, and CosyVoice
Built-in YouTube downloading and Demucs vocal separation, so the pipeline starts from a URL rather than a clean audio file
Claims Windows + NVIDIA GPU as the verified platform; Mac and Linux support is mentioned but explicitly unverified
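Now fully open source after the maintainers shifted focus to a separate commercial project (WeConnect); the license is described as LGPL here but listed as GPL-3.0 in the repository metadata, so check the LICENSE file before reuse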

Caveats

Active development has stopped; the README states updates are “not possible for the time being”
Initial setup downloads a 9GB CosyVoice model, which the README warns may take over an hour
The free version appears limited to 60-second media clips based on version history notes

Verdict

Worth a look if you need a local, all-in-one dubbing workstation and don’t mind some assembly friction. Skip it if you want a polished SaaS experience or need reliable cross-platform support: this is a Windows-first project with maintenance on indefinite hold.

Frequently asked

What is abus-aikorea/voice-pro? Voice-Pro bundles Whisper, F5-TTS, CosyVoice, and a dozen other tools into a single Gradio interface for creators who want ElevenLabs-like results without the API bills.
Is voice-pro open source? Yes. abus-aikorea/voice-pro is open source, released under the GPL-3.0 license.
What language is voice-pro written in? abus-aikorea/voice-pro is primarily written in Python.
How popular is voice-pro? abus-aikorea/voice-pro has 11.2k stars on GitHub, and its star growth is holding steady at about +6.6 stars per day over the past 7 days.
Where can I find voice-pro? abus-aikorea/voice-pro is on GitHub at https://github.com/abus-aikorea/voice-pro.