alphacep/vosk-api
An offline, open-source speech recognition toolkit that uses deep neural networks for continuous large-vocabulary transcription and speaker identification.

Vosk is an offline speech recognition API that runs on Android, iOS, Raspberry Pi, and servers. It provides continuous large-vocabulary transcription through a streaming API with zero-latency response, performs speaker identification, and supports more than 20 languages and dialects. The toolkit is built on Kaldi and deep neural network architectures, ships small pre-trained models (about 50 MB), and offers bindings for Python, Java, Node.js, C#, C++, Rust, and Go.
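
As a rough illustration of the streaming workflow with the Python binding, the sketch below feeds a WAV file to a recognizer in small chunks and joins the recognized text. It assumes the `vosk` package is installed and that a model has already been downloaded and unpacked; `transcribe_wav`, `join_results`, and both path arguments are hypothetical names chosen for this example, not part of the library.

```python
import json
import wave


def join_results(json_results):
    """Join the "text" field of each JSON result string, skipping empty ones."""
    texts = (json.loads(r).get("text", "") for r in json_results)
    return " ".join(t for t in texts if t)


def transcribe_wav(wav_path, model_path):
    """Transcribe a 16-bit mono PCM WAV file by streaming it in chunks.

    wav_path and model_path are illustrative; model_path is assumed to be
    the directory of an unpacked Vosk model.
    """
    # Imported lazily so join_results can be used without vosk installed.
    from vosk import Model, KaldiRecognizer

    with wave.open(wav_path, "rb") as wf:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM audio")

        model = Model(model_path)
        rec = KaldiRecognizer(model, wf.getframerate())

        results = []
        while True:
            data = wf.readframes(4000)
            if not data:
                break
            # AcceptWaveform returns True once an utterance is complete;
            # Result() then returns that utterance as a JSON string.
            if rec.AcceptWaveform(data):
                results.append(rec.Result())

        # Flush whatever audio is still buffered in the recognizer.
        results.append(rec.FinalResult())

    return join_results(results)
```

A call might look like `transcribe_wav("speech.wav", "model")`, where both paths are placeholders. For live audio, the same loop could read chunks from a microphone stream and show `rec.PartialResult()` between completed utterances.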