Speech recognition that stays off the server
Vosk-browser runs Kaldi-based ASR in a WebWorker via WebAssembly, so your voice data never leaves the tab.

What it does
Vosk-browser wraps the Vosk speech recognition engine in a WebAssembly build purpose-built for browser WebWorkers. You load a model (a .tar.gz file), wire it to microphone input through the Web Audio API, and get text back — all without a network round-trip to a speech API.
The interesting bit
The heavy lifting isn’t new; Denis Treskunov did the original WebAssembly port of Kaldi. This project packages an updated Vosk build and, crucially, makes the browser integration ergonomic. The WebWorker constraint matters: it keeps the main thread free while the recognizer chews on audio, which is the only sane way to do real-time ASR in a browser without jank.
Key highlights
- 13 languages supported in the live demo
- CDN loadable via jsdelivr as a global Vosk object
- API surface is small: create model, instantiate KaldiRecognizer, feed it AudioBuffer chunks via acceptWaveform()
- Explicitly browser-only; NodeJS users are directed to official bindings
- Examples and API docs live in ./lib/README.md and ./examples/
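
To make the "small API surface" concrete, here is a minimal sketch of the create-model, instantiate-recognizer, feed-audio flow. Treat it as an illustration under assumptions rather than the project's reference code: attachRecognizer, onText, and the "model.tar.gz" URL are hypothetical names for this example, and the "result" / "partialresult" event names, the message shapes, and the sample-rate constructor argument are assumptions to verify against ./lib/README.md for the version you install.

```javascript
// Sketch only: wires an already-loaded Vosk model to a microphone stream.
// `attachRecognizer` and `onText` are hypothetical names for this example.
function attachRecognizer(model, audioContext, mediaStream, onText) {
  // Assumption: the recognizer constructor takes the audio sample rate.
  const recognizer = new model.KaldiRecognizer(audioContext.sampleRate);

  // Assumption: final results arrive as { result: { text } } and
  // partial results as { result: { partial } }.
  recognizer.on("result", (message) => onText(message.result.text, true));
  recognizer.on("partialresult", (message) =>
    onText(message.result.partial, false)
  );

  // ScriptProcessorNode is deprecated but simple: it delivers AudioBuffer chunks.
  const node = audioContext.createScriptProcessor(4096, 1, 1);
  node.onaudioprocess = (event) => {
    try {
      recognizer.acceptWaveform(event.inputBuffer);
    } catch (error) {
      console.error("acceptWaveform failed", error);
    }
  };

  audioContext.createMediaStreamSource(mediaStream).connect(node);
  // Some browsers only fire onaudioprocess once the node reaches a destination.
  node.connect(audioContext.destination);
  return recognizer;
}

// Browser-only entry point; "model.tar.gz" is a placeholder URL you host yourself.
async function init() {
  const model = await Vosk.createModel("model.tar.gz");
  const mediaStream = await navigator.mediaDevices.getUserMedia({
    video: false,
    audio: { echoCancellation: true, noiseSuppression: true, channelCount: 1 },
  });
  const audioContext = new AudioContext();
  attachRecognizer(model, audioContext, mediaStream, (text, isFinal) => {
    console.log(isFinal ? `Result: ${text}` : `Partial: ${text}`);
  });
}
```

Call init() from a user gesture such as a button click, since browsers generally block audio contexts and microphone prompts until the user interacts with the page. Passing the model, audio context, and stream into attachRecognizer keeps the wiring separate from the browser-only setup, which makes it easier to exercise with stand-in objects.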
Caveats
- No tests yet (listed in Todos)
- Model files are your problem to host and serve; no cloud model delivery
- Speaker identification models are not yet demonstrated
Verdict
Worth a look if you need offline, privacy-preserving speech-to-text in a web app and can stomach shipping multi-megabyte model files. If you want turnkey cloud accuracy or NodeJS server-side ASR, this isn’t your tool.