pyannote/pyannote-audio
Neural network toolkit for speaker diarization providing pretrained models and pipelines for identifying who spoke when in audio recordings.

Velocity (7d): +2.7 ★/day
Trend: steady
pyannote.audio is an open-source Python toolkit for speaker diarization built on PyTorch. It provides neural building blocks for speech activity detection, speaker change detection, overlapped speech detection, and speaker embedding extraction. The project offers pretrained pipelines and models on Hugging Face, supports multi-GPU training via PyTorch Lightning, and can be fine-tuned on custom data for improved performance.
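To make "who spoke when" concrete, here is a minimal sketch of post-processing diarization output in plain Python. It does not call pyannote.audio itself. The `turns` list is made-up illustrative data, and the helper names `speaking_time` and `overlapped_speech` are hypothetical, not part of the library. In practice the turns would come from a pretrained pipeline (loaded with `Pipeline.from_pretrained`) rather than being written by hand.

```python
from collections import defaultdict

# Hypothetical diarization result: (start_seconds, end_seconds, speaker_label).
# These values are invented for illustration, not produced by a real model.
turns = [
    (0.0, 4.0, "SPEAKER_00"),
    (3.0, 7.0, "SPEAKER_01"),
    (7.5, 10.0, "SPEAKER_00"),
]


def speaking_time(turns):
    """Sum the duration of every turn, grouped by speaker label."""
    totals = defaultdict(float)
    for start, end, speaker in turns:
        totals[speaker] += end - start
    return dict(totals)


def overlapped_speech(turns):
    """Total time during which two or more speakers are active (sweep line)."""
    events = []
    for start, end, _ in turns:
        events.append((start, 1))   # a speaker becomes active
        events.append((end, -1))    # a speaker stops
    # At equal timestamps, -1 sorts before +1, so turns that merely touch
    # are not counted as overlapping.
    events.sort()

    active, overlap, previous = 0, 0.0, None
    for time, delta in events:
        if active >= 2:
            overlap += time - previous
        active += delta
        previous = time
    return overlap


print(speaking_time(turns))
print(overlapped_speech(turns))
```

The same two summaries can be computed from any source of timestamped speaker turns, which is why the sketch keeps the data as plain tuples instead of depending on a specific library type.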