jhj0517/Whisper-WebUI
A browser interface for running OpenAI's Whisper model to generate subtitles from audio files, YouTube, and microphone input.

Velocity (7d): +2.4 ★ / day
Trend: steady
This project provides a Gradio-based web interface for the Whisper speech-to-text model, generating subtitles from audio files, YouTube, and microphone input. It supports multiple Whisper implementations, including faster-whisper and insanely-fast-whisper, for faster transcription. The pipeline integrates Silero VAD for voice activity detection and pyannote for speaker diarization, and offers translation through Facebook NLLB models and the DeepL API.
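The last step of such a pipeline, turning timed transcript segments into a subtitle file, can be sketched in a few lines. This is a minimal illustration, not the project's own code: the `Segment` class and both function names are hypothetical, and it assumes only that the chosen Whisper backend yields segments with a start time, an end time (both in seconds), and text. The output follows the common SRT layout.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Segment:
    """One transcribed span. Hypothetical stand-in for a backend's segment object."""
    start: float  # start time in seconds
    end: float    # end time in seconds
    text: str


def format_timestamp(seconds: float) -> str:
    """Render a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = max(0, round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: Iterable[Segment]) -> str:
    """Join segments into SRT text: index line, time range line, then the caption."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(seg.start)} --> {format_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [Segment(0.0, 1.25, " Hello "), Segment(1.25, 2.0, "world")]
    print(segments_to_srt(demo))
```

Keeping the formatter independent of any particular backend means the same function could serve whichever Whisper implementation produced the segments, as long as they are first mapped to this start/end/text shape.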