jonatasgrosman/huggingsound
Python library for speech recognition using CTC and transformer models from Hugging Face.

Velocity (7d): +0.3 ★/day · Trend: steady
A speech processing toolkit built on Hugging Face infrastructure that provides ready-to-use interfaces for automatic speech recognition. It supports transcription with CTC models such as wav2vec2, returns character-level timestamps and probabilities, and can decode with a language model for improved accuracy. The library also includes speaker diarization, speech enhancement, and fine-tuning capabilities.
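To make the idea of CTC transcription with character-level timestamps and probabilities concrete, here is a minimal sketch of greedy CTC decoding in plain Python. It is not the library's implementation: the function name `ctc_greedy_decode`, the 20 ms frame duration, the toy vocabulary, and the output dictionary keys are all assumptions chosen for illustration.

```python
from typing import Dict, List, Sequence


def ctc_greedy_decode(
    frame_probs: Sequence[Sequence[float]],
    vocab: Sequence[str],
    blank_id: int = 0,
    frame_ms: int = 20,  # assumed frame duration, for illustration only
) -> Dict[str, object]:
    """Greedy CTC decoding: take the argmax of each frame, merge
    consecutive repeats, and drop blanks. Each emitted character keeps
    the time span of its run and the highest probability seen in it."""
    chars: List[str] = []
    starts: List[int] = []
    ends: List[int] = []
    probs: List[float] = []
    prev_id = blank_id

    for t, dist in enumerate(frame_probs):
        best_id = max(range(len(dist)), key=dist.__getitem__)
        if best_id != blank_id:
            if best_id != prev_id:
                # A new character starts at this frame.
                chars.append(vocab[best_id])
                starts.append(t * frame_ms)
                ends.append((t + 1) * frame_ms)
                probs.append(dist[best_id])
            else:
                # Same character as the previous frame: extend its span.
                ends[-1] = (t + 1) * frame_ms
                probs[-1] = max(probs[-1], dist[best_id])
        prev_id = best_id

    return {
        "transcription": "".join(chars),
        "start_timestamps": starts,
        "end_timestamps": ends,
        "probabilities": probs,
    }


# Toy input: 6 frames over a 3-symbol vocabulary (index 0 is the CTC blank).
vocab = ["<blank>", "h", "i"]
frames = [
    [0.10, 0.80, 0.10],  # h
    [0.20, 0.70, 0.10],  # h (repeat, merged)
    [0.90, 0.05, 0.05],  # blank
    [0.10, 0.10, 0.80],  # i
    [0.60, 0.20, 0.20],  # blank (separates the two i's)
    [0.10, 0.20, 0.70],  # i
]
result = ctc_greedy_decode(frames, vocab)
print(result["transcription"])
```

The blank between the two `i` frames is what lets CTC emit a doubled character; without it the repeats would be merged into one. Language model decoding replaces the per-frame argmax with a beam search that rescores candidate strings, which this sketch does not attempt.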