ufal/whisper_streaming
A real-time streaming implementation of the Whisper speech recognition model for continuous speech-to-text transcription with 3.3-second latency.

Velocity (7d): +3.1 ★/day · Trend: steady
This project extends OpenAI’s Whisper model to support real-time streaming transcription and translation. It implements a local agreement policy with self-adaptive latency to enable continuous inference on unsegmented long-form audio. The system was designed for live transcription services and demonstrated at a multilingual conference.
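The local agreement idea can be illustrated with a minimal sketch. This is not the project's code: the class name `LocalAgreement`, the helper `common_prefix`, and the word-list interface are all hypothetical, and the sketch assumes each new hypothesis is a re-transcription of the whole audio buffer that begins with the words already committed. Under those assumptions, a word is emitted only once two consecutive hypotheses agree on it.

```python
from typing import List


def common_prefix(a: List[str], b: List[str]) -> List[str]:
    """Return the longest shared prefix of two word lists."""
    prefix = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return prefix


class LocalAgreement:
    """Hypothetical sketch: commit words confirmed by two consecutive hypotheses."""

    def __init__(self) -> None:
        self.committed: List[str] = []  # words already emitted as final
        self.previous: List[str] = []   # unconfirmed tail of the last hypothesis

    def update(self, hypothesis: List[str]) -> List[str]:
        # Assumption: the hypothesis starts with the already committed words,
        # so only the part after them is still open for confirmation.
        tail = hypothesis[len(self.committed):]
        # Words on which this hypothesis and the previous one agree are confirmed.
        agreed = common_prefix(self.previous, tail)
        self.committed.extend(agreed)
        # Whatever remains unconfirmed waits for the next hypothesis.
        self.previous = tail[len(agreed):]
        return agreed


policy = LocalAgreement()
first = policy.update(["hello", "word"])
second = policy.update(["hello", "world", "this"])
third = policy.update(["hello", "world", "this", "is"])
```

In this run nothing is emitted after the first hypothesis, `hello` is emitted after the second, and `world this` after the third; the misrecognized `word` is never committed because the next hypothesis disagrees with it. Waiting for agreement trades a small delay for stable output, which is one way to read the "self-adaptive latency" described above.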