abus-aikorea/voice-pro
A Gradio-based web application providing AI-powered TTS, zero-shot voice cloning, and speech-to-text with multilingual dubbing and YouTube video processing.

This project bundles multiple open-source AI speech models into a single desktop application with a Gradio UI. It supports text-to-speech via Edge-TTS and Kokoro, zero-shot voice cloning via E2-TTS, F5-TTS, and CosyVoice, and speech recognition via Faster-Whisper, WhisperX, and Whisper-Timestamped. Additional features include YouTube video downloading, audio source separation via Demucs or MDX-Net, and multilingual translation for dubbing workflows.
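To make the dubbing workflow concrete, here is a minimal sketch of how the three stages (speech recognition, translation, speech synthesis) could be chained. It is not taken from the project's code: `Segment`, `dub`, and the three `fake_*` backends are hypothetical names introduced only for illustration, and the stubs stand in for real engines such as Faster-Whisper or Edge-TTS.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Segment:
    """A timed piece of speech (hypothetical structure, times in seconds)."""
    start: float
    end: float
    text: str


# Hypothetical stage signatures; a real backend would wrap an actual model.
Transcriber = Callable[[str], List[Segment]]   # audio path -> timed segments
Translator = Callable[[str, str], str]         # (text, target language) -> text
Synthesizer = Callable[[str, str], bytes]      # (text, target language) -> audio


def dub(
    audio_path: str,
    target_lang: str,
    transcribe: Transcriber,
    translate: Translator,
    synthesize: Synthesizer,
) -> List[Tuple[Segment, bytes]]:
    """Transcribe, translate, and re-synthesize each segment in order."""
    results = []
    for seg in transcribe(audio_path):
        translated = translate(seg.text, target_lang)
        audio = synthesize(translated, target_lang)
        # Keep the original timing so the new audio can be aligned later.
        results.append((Segment(seg.start, seg.end, translated), audio))
    return results


# Stub backends used only to exercise the pipeline without any models.
def fake_transcribe(path: str) -> List[Segment]:
    return [Segment(0.0, 1.5, "hello"), Segment(1.5, 3.0, "world")]


def fake_translate(text: str, lang: str) -> str:
    return f"[{lang}] {text}"


def fake_synthesize(text: str, lang: str) -> bytes:
    return text.encode("utf-8")


if __name__ == "__main__":
    for segment, audio in dub(
        "clip.wav", "ko", fake_transcribe, fake_translate, fake_synthesize
    ):
        print(segment.start, segment.end, segment.text, len(audio))
```

Passing the stages in as plain callables keeps the sketch independent of any one engine, which mirrors the idea of swapping between several recognition and synthesis models behind one interface.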