snakers4/silero-vad
Pre-trained enterprise-grade Voice Activity Detector model for real-time speech detection in audio streams.

Velocity (7d): +4.6 ★/day · Trend: steady
Silero VAD provides a pre-trained neural network model that detects voice activity in audio streams in real time. The model is exportable to ONNX format for cross-platform deployment via ONNX Runtime, and supports integration with speech-to-text pipelines. It offers multiple pretrained versions optimized for different use cases including real-time streaming and batch processing.
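To make the streaming use case concrete, here is a minimal sketch of how per-frame speech probabilities from a voice activity detector could be merged into speech segments before being passed to a speech-to-text stage. It does not call Silero VAD itself: the function name `probs_to_segments`, the 32 ms frame length, the 0.5 threshold, and the sample probabilities are all illustrative assumptions, not the library's API or defaults.

```python
from typing import List, Tuple


def probs_to_segments(
    probs: List[float],
    frame_ms: int = 32,           # assumed frame duration, not taken from the source
    threshold: float = 0.5,       # assumed decision threshold
    min_silence_frames: int = 2,  # silent frames required to close a segment
) -> List[Tuple[int, int]]:
    """Merge per-frame speech probabilities into (start_ms, end_ms) segments.

    Hypothetical helper for illustration; the probabilities stand in for
    whatever a VAD model emits for each audio frame.
    """
    segments: List[Tuple[int, int]] = []
    start = None       # index of the first frame of the open segment
    silence_run = 0    # consecutive sub-threshold frames inside a segment

    for i, p in enumerate(probs):
        if p >= threshold:
            if start is None:
                start = i
            silence_run = 0
        elif start is not None:
            silence_run += 1
            if silence_run >= min_silence_frames:
                # The segment ends at the first frame of the silent run.
                end = i - silence_run + 1
                segments.append((start * frame_ms, end * frame_ms))
                start = None
                silence_run = 0

    if start is not None:
        # Audio ended while a segment was open; drop trailing silent frames.
        end = len(probs) - silence_run
        segments.append((start * frame_ms, end * frame_ms))

    return segments


# Made-up probabilities for eight consecutive frames.
example_probs = [0.1, 0.9, 0.8, 0.2, 0.9, 0.1, 0.1, 0.7]
segments = probs_to_segments(example_probs)
print(segments)
```

Requiring a short run of silent frames before closing a segment keeps a single low-probability frame (the `0.2` above) from splitting one utterance in two, which matters when segments are handed to a downstream recognizer.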