abus-aikorea/voice-pro

A Gradio-based web application providing AI-powered TTS, zero-shot voice cloning, and speech-to-text with multilingual dubbing and YouTube video processing.

Velocity (7d): +16 ★/day · Trend: steady

This project bundles multiple open-source AI speech models into a single desktop application with a Gradio UI. It supports text-to-speech via Edge-TTS and kokoro, zero-shot voice cloning through E2-TTS, F5-TTS, and CosyVoice, and speech recognition using Faster-Whisper, WhisperX, and Whisper-Timestamped. Additional features include YouTube downloading, audio source separation via Demucs/MDX-Net, and multilingual translation for dubbing workflows.
