alphacep/vosk-server

Speech recognition that doesn't phone home

An offline server that wraps Vosk/Kaldi in four protocols so you can pick your poison.

What it does

vosk-server is a thin server layer around the Vosk-API and Kaldi speech recognition engines. It exposes the same offline ASR backend through four protocols: WebSocket, gRPC, WebRTC, and MQTT. You run it locally or on a server, point your client at it, and get transcripts without shipping audio to cloud APIs.
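To make the WebSocket path concrete, here is a minimal sketch of what a client conversation might look like. It only builds and parses messages, with no network code, so it runs with the standard library alone. The message shapes (a `config` frame carrying `sample_rate`, raw PCM sent as binary frames, a closing `eof` frame, and replies carrying either `partial` or `text`) are assumptions modeled on the project's sample client as recalled here, so check them against the upstream docs before relying on them. The names `build_messages` and `extract_text` are hypothetical helpers, not part of vosk-server.

```python
import json
from typing import Iterator, Tuple, Union

Message = Union[str, bytes]  # str = text frame, bytes = binary frame


def build_messages(pcm: bytes, sample_rate: int = 16000,
                   chunk_bytes: int = 8000) -> Iterator[Message]:
    """Yield frames in the order a client would send them (assumed protocol)."""
    # Assumed: the session opens with a JSON text frame describing the audio.
    yield json.dumps({"config": {"sample_rate": sample_rate}})
    # Assumed: audio follows as raw PCM in binary frames of bounded size.
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]
    # Assumed: a final JSON text frame tells the server the stream is done.
    yield json.dumps({"eof": 1})


def extract_text(reply: str) -> Tuple[bool, str]:
    """Return (is_final, text) from one JSON reply (assumed reply shape)."""
    data = json.loads(reply)
    if "text" in data:
        return True, data["text"]
    return False, data.get("partial", "")


if __name__ == "__main__":
    # 20,000 bytes of silence stands in for real 16-bit mono PCM.
    frames = list(build_messages(b"\x00" * 20000))
    print(len(frames), "frames; first:", frames[0])
    print(extract_text('{"partial": "hello"}'))
```

In a real client, each yielded frame would be passed to a WebSocket library's send call and each received reply to `extract_text`. Keeping the framing separate from the transport makes it easy to test without a running server.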

The interesting bit

The project is essentially protocol glue, but useful glue. The WebRTC path is the unusual one: many open-source ASR tools stop at HTTP or gRPC and leave browser-based real-time audio as an exercise for the reader. Here it’s built in, which matters for telephony and web chatbots where latency stings.

Key highlights

  • Four protocol servers in one repo: WebSocket, gRPC, WebRTC, MQTT
  • Targets specific integration patterns: smart home, PBX (FreeSWITCH, Asterisk), web backends, chatbots
  • Fully offline: runs the Vosk/Kaldi stack locally, no external API calls
  • Docker-based deployment; docs live on the separate Vosk website

Caveats

  • The README is sparse; actual setup instructions are off-repo at alphacephei.com/vosk/server
  • No benchmarks, model sizes, or hardware requirements listed in the repository itself

Verdict

Worth a look if you’re building voice features into a web app or phone system and would rather not feed Google or AWS your audio stream. Skip it if you need managed scaling, detailed telemetry, or extensive in-repo documentation.
