Is sherpa-onnx open source?

Yes — k2-fsa/sherpa-onnx is open source, released under the Apache-2.0 license.

What language is sherpa-onnx written in?

k2-fsa/sherpa-onnx is primarily written in C++.

How popular is sherpa-onnx?

k2-fsa/sherpa-onnx has 13.7k stars on GitHub, and its star growth is currently holding steady.

Where can I find sherpa-onnx?

k2-fsa/sherpa-onnx is on GitHub at https://github.com/k2-fsa/sherpa-onnx.

k2-fsa/sherpa-onnx

Speech AI that runs on a $15 RISC-V board

A fully offline speech toolkit that packs ASR, TTS, diarization, and VAD into one C++ runtime built on ONNX Runtime, with bindings for twelve languages and a wide range of edge platforms.

★ 13.7k stars · C++ · Image · Video · Audio

Velocity (7d): +19 ★ / day · Trend: steady

What it does

sherpa-onnx is a C++ inference engine for speech tasks: streaming and batch ASR, text-to-speech, speaker diarization/verification/identification, voice activity detection, audio tagging, keyword spotting, speech enhancement, and source separation. It runs entirely offline via ONNX Runtime, with bindings for C++, C, Python, Go, C#, Java, Kotlin, JavaScript, Swift, Rust, Dart, and even Object Pascal. WebAssembly builds let you try it in a browser without installing anything.
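
As an illustration of what the Python binding looks like in practice, here is a minimal sketch of non-streaming recognition with a transducer model. It assumes the `sherpa-onnx` package is installed from PyPI and that the caller supplies paths to the encoder, decoder, joiner, and tokens files of one of the project's pre-trained models. `read_wav_mono16` and `transcribe` are hypothetical helper names, not part of the library.

```python
import array
import sys
import wave


def read_wav_mono16(path):
    """Read a 16-bit mono PCM WAV file; return (sample_rate, floats in [-1, 1))."""
    with wave.open(path, "rb") as f:
        if f.getnchannels() != 1 or f.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM")
        sample_rate = f.getframerate()
        pcm = array.array("h", f.readframes(f.getnframes()))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian
    return sample_rate, [s / 32768.0 for s in pcm]


def transcribe(wav_path, encoder, decoder, joiner, tokens):
    """Decode one file offline. All model paths are supplied by the caller."""
    # Imported lazily so the WAV helper above works without these packages.
    import numpy as np
    import sherpa_onnx

    recognizer = sherpa_onnx.OfflineRecognizer.from_transducer(
        encoder=encoder,
        decoder=decoder,
        joiner=joiner,
        tokens=tokens,
        num_threads=1,
    )
    sample_rate, samples = read_wav_mono16(wav_path)
    stream = recognizer.create_stream()
    stream.accept_waveform(sample_rate, np.asarray(samples, dtype=np.float32))
    recognizer.decode_stream(stream)
    return stream.result.text
```

Other model families are loaded through different factory methods, so treat this as one possible shape rather than the canonical entry point.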

The interesting bit

The platform matrix reads like a hardware collector’s fever dream: x86, ARM32/64, RISC-V, Raspberry Pi, NVIDIA Jetson, Rockchip and Ascend NPUs, HarmonyOS, WearOS, and obscure boards like the LicheePi 4A and SpacemiT-K1. Most speech libraries pick a lane: cloud API, desktop Python, or embedded C. This one ships pre-built Android APKs, WebAssembly demos, and native bindings across all of these platforms, apparently from a single C++ core.

Key highlights

Runs fully offline; no network required for inference
Supports both streaming and non-streaming ASR, plus TTS with voice cloning
NPU acceleration for Rockchip RKNN, Qualcomm QNN, Huawei Ascend, and Axera
Hugging Face Spaces and WebAssembly demos for zero-install testing
Pre-built APKs for Android: speaker diarization, VAD+ASR, two-pass recognition, TTS
12 language bindings, including the rarely seen Object Pascal and Dart
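
The streaming versus non-streaming distinction in the highlights comes down to how audio reaches the recognizer. The sketch below shows only that input shape, with no speech library involved: `iter_chunks`, `feed_streaming`, and `feed_batch` are hypothetical names, and the callbacks stand in for whatever recognizer would consume the audio.

```python
def iter_chunks(samples, sample_rate, chunk_ms=100):
    """Yield consecutive slices of `samples`, each about `chunk_ms` of audio."""
    step = max(1, sample_rate * chunk_ms // 1000)
    for start in range(0, len(samples), step):
        yield samples[start:start + step]


def feed_streaming(samples, sample_rate, on_chunk, chunk_ms=100):
    """Streaming shape: deliver audio piece by piece, as a microphone would."""
    for chunk in iter_chunks(samples, sample_rate, chunk_ms):
        on_chunk(chunk)  # a streaming recognizer could emit partial text here


def feed_batch(samples, on_utterance):
    """Non-streaming shape: deliver the whole recording in one call."""
    on_utterance(samples)
```

Streaming suits live captions and voice commands, where partial results matter; batch decoding suits recorded files, where the recognizer can see the whole utterance before committing to a transcript.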

Caveats

The README is a feature matrix and link farm; architecture diagrams, latency numbers, and model sizes are absent
“Supported” spans a vast surface area; actual performance on the more exotic boards (RISC-V, ARM32) is unclear
It is unclear how much of the code is glue and how much is original inference engine; the project appears to aggregate models like Whisper, Silero VAD, Piper TTS, and Zipformer under one runtime
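
Since published latency numbers are missing, one way to fill the gap on your own board is to measure the real-time factor (processing time divided by audio duration) yourself. This is a minimal sketch, not something the project provides: `real_time_factor` is a hypothetical helper, and `run` is any zero-argument callable that performs one full decode of a clip of known length.

```python
import time


def real_time_factor(run, audio_seconds, repeats=3):
    """Return the best-of-N real-time factor: processing time / audio duration."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        best = min(best, time.perf_counter() - start)
    return best / audio_seconds
```

A value below 1.0 means the device decodes faster than the audio plays, which is the minimum bar for live use. Taking the best of several runs reduces noise from cold caches and model loading.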

Verdict

Worth a look if you’re building voice features for hardware that lacks a GPU, a network connection, or both. Skip it if you need cloud-scale throughput or detailed benchmarking data to make platform decisions.
