heatdrop.ai

The hottest AI & LLM repositories on GitHub — measured, ranked, and explained.


syl22-00/pocketsphinx.js

Speech recognition that never phones home

PocketSphinx.js compiles a C speech recognizer to WebAssembly so your browser can transcribe audio without sending it to anyone else's server.

★ 1.5k stars · JavaScript · Image · Video · Audio


Star history (pocketsphinx.js): velocity over 7 days +0.3 ★ / day · trend: steady

What it does

PocketSphinx.js is a browser-based speech recognizer built by compiling the C library PocketSphinx to JavaScript (asm.js) or WebAssembly with Emscripten. It includes an audio recorder built on the Web Audio API, a Web Worker wrapper that keeps recognition off the UI thread, and a callback utility for cleaner worker communication. The whole pipeline runs locally: no cloud APIs, no network latency, and no audio leaving the machine.
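The callback utility mentioned above solves a common Web Worker problem: `postMessage` replies arrive without any link to the request that caused them. A minimal sketch of that pattern follows. The names `CallbackRegistry`, `sendCommand`, `handleReply`, and the `{ command, data, callbackId }` message shape are hypothetical illustrations, not the project's actual API.

```javascript
// Hypothetical sketch: pair each worker request with a one-shot callback id.
class CallbackRegistry {
  constructor() {
    this.nextId = 0;
    this.pending = new Map();
  }

  // Store a callback and return the id to send along with the request.
  add(callback) {
    const id = this.nextId++;
    this.pending.set(id, callback);
    return id;
  }

  // Retrieve and forget a callback; returns null for unknown or used ids.
  take(id) {
    const callback = this.pending.get(id);
    this.pending.delete(id);
    return callback === undefined ? null : callback;
  }
}

// Post a command to a worker-like object; the promise resolves on reply.
function sendCommand(worker, registry, command, data) {
  return new Promise((resolve) => {
    const callbackId = registry.add(resolve);
    worker.postMessage({ command, data, callbackId });
  });
}

// Call this from worker.onmessage with event.data.
// Returns true if a pending callback was found and invoked.
function handleReply(registry, message) {
  const callback = registry.take(message.callbackId);
  if (callback === null) return false;
  callback(message.data);
  return true;
}
```

Because each id is consumed on first use, a duplicate or late reply is ignored rather than firing the wrong handler.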

The interesting bit

The project treats the browser as a full compilation target, not just a JavaScript runtime. You can embed acoustic models, language models, and dictionaries directly into the build output via CMake flags, or package them separately and load them at runtime to avoid a multi-megabyte initial download. There is also a separate Chinese demo, plus keyword spotting support for wake-word-style detection.

Key highlights

Compiles to either asm.js or WebAssembly; the WebAssembly build requires the server to send .wasm files with the application/wasm MIME type
recognizer.js wraps the heavy lifting in a Web Worker so the main thread stays responsive
audioRecorder.js handles sample-rate conversion and can be reused for non-speech audio applications
Supports custom acoustic models, statistical language models, and dictionaries at build time or runtime
Includes live demos for English, Chinese, and keyword spotting
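To make the sample-rate conversion step concrete: browsers typically capture audio at 44.1 or 48 kHz as floating-point samples, while classic speech models commonly expect 16 kHz 16-bit PCM. The function below is a rough sketch of one simple way to bridge that gap by averaging blocks of input samples. It is not the code in audioRecorder.js; the function name and the 16 kHz default are assumptions for illustration.

```javascript
// Illustrative sketch: downsample mono Float32 samples (range -1..1) by
// averaging each window of input samples, then convert to 16-bit integers.
function downsampleToInt16(input, inputRate, outputRate = 16000) {
  if (outputRate > inputRate) {
    throw new RangeError("outputRate must not exceed inputRate");
  }
  const ratio = inputRate / outputRate;
  const outLength = Math.floor(input.length / ratio);
  const output = new Int16Array(outLength);

  for (let i = 0; i < outLength; i++) {
    // Input window that maps onto output sample i (at least one sample wide).
    const start = Math.floor(i * ratio);
    const end = Math.min(
      input.length,
      Math.max(start + 1, Math.floor((i + 1) * ratio))
    );

    let sum = 0;
    for (let j = start; j < end; j++) sum += input[j];
    const mean = sum / (end - start);

    // Clamp to the valid range before scaling to avoid integer overflow.
    const clamped = Math.max(-1, Math.min(1, mean));
    output[i] = Math.round(clamped * 32767);
  }
  return output;
}
```

Block averaging acts as a crude low-pass filter. A production resampler would use a proper anti-aliasing filter, but this shows the shape of the conversion.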

Caveats

The compiled output is “a few MB” and loads synchronously, so the Web Worker wrapper is essentially mandatory for production use
Build process requires Emscripten, CMake, and careful submodule initialization; Windows users get sent to the Emscripten manual
README warns that you must serve over HTTPS or localhost for audio recording to work, and suggests running Chrome with --disable-web-security for local testing. That flag turns off the browser's same-origin protections, so it is a security footgun if misunderstood
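For local testing, a small static server on localhost can satisfy both serving requirements mentioned on this page (a secure origin for audio recording, and application/wasm for WebAssembly files) without disabling browser security. The following is a minimal Node.js sketch, not something shipped with the project; the function names, directory, and port are assumptions.

```javascript
// Illustrative local static server helpers (Node.js, CommonJS).
const http = require("node:http");
const fs = require("node:fs");
const path = require("node:path");

// Extension-to-MIME map; the .wasm entry is the one that matters here.
const MIME_TYPES = {
  ".html": "text/html; charset=utf-8",
  ".js": "text/javascript; charset=utf-8",
  ".wasm": "application/wasm",
};

function contentTypeFor(filePath) {
  const ext = path.extname(filePath).toLowerCase();
  return MIME_TYPES[ext] || "application/octet-stream";
}

// Map a request URL to a file under rootDir, or null if it would escape it.
function resolveInside(rootDir, urlPath) {
  const root = path.resolve(rootDir);
  let decoded;
  try {
    decoded = decodeURIComponent(urlPath.split("?")[0]);
  } catch {
    return null; // malformed percent-encoding
  }
  const target = path.resolve(root, "." + decoded);
  return target.startsWith(root + path.sep) ? target : null;
}

// Build (but do not start) a server that serves files from rootDir.
function createStaticServer(rootDir) {
  return http.createServer((req, res) => {
    const urlPath = req.url === "/" ? "/index.html" : req.url;
    const filePath = resolveInside(rootDir, urlPath);
    if (!filePath) {
      res.writeHead(404);
      res.end("Not found");
      return;
    }
    fs.readFile(filePath, (err, data) => {
      if (err) {
        res.writeHead(404);
        res.end("Not found");
        return;
      }
      res.writeHead(200, { "Content-Type": contentTypeFor(filePath) });
      res.end(data);
    });
  });
}
```

Calling `createStaticServer("./webapp").listen(8000, "localhost")` would then serve that directory at http://localhost:8000/, which browsers treat as a secure context for microphone access.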

Verdict

Worth a look if you need offline speech recognition in a web app and can tolerate the complexity of shipping your own models. Skip it if you want plug-and-play accuracy or modern neural-network-based recognition; this is classic HMM-GMM speech recognition, not Whisper.
