microsoft/UniSpeech
A Microsoft repository providing speech processing models (WavLM, UniSpeech, UniSpeech-SAT) pre-trained with self-supervised learning for tasks including speaker verification, speech recognition, and speech separation.

UniSpeech is a family of speech processing models developed by Microsoft Research, each pre-trained at scale with self-supervised learning. The family includes WavLM for full-stack speech processing, UniSpeech for unified ASR pre-training, and UniSpeech-SAT, which adds speaker-aware pre-training to improve performance on speaker-related tasks. All models are implemented in PyTorch, and pre-trained weights are released via Hugging Face. The repository provides evaluation benchmarks, inference code, and training pipelines for the downstream tasks of speaker diarization, speaker verification, and speech recognition.
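
To make the speaker verification task concrete, here is a minimal sketch of one common way such a system makes its decision: two utterances are mapped to fixed-size speaker embeddings, and the pair is accepted as the same speaker when the cosine similarity of the embeddings exceeds a threshold. This is an illustration only, not the repository's own code. The function names, the toy three-dimensional embeddings, and the 0.5 threshold are all assumptions chosen for the example; in practice the embeddings would come from a pre-trained model and the threshold would be tuned on held-out trials.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_speaker(a: Sequence[float], b: Sequence[float],
                 threshold: float = 0.5) -> bool:
    """Accept the trial if the similarity reaches the threshold.

    The default threshold is a placeholder (assumption), not a tuned value.
    """
    return cosine_similarity(a, b) >= threshold


# Toy embeddings (hypothetical values standing in for model outputs).
enrolled = [0.9, 0.1, 0.3]
test_same = [0.8, 0.2, 0.25]
test_other = [-0.2, 0.9, -0.4]

print("same-speaker trial accepted:", same_speaker(enrolled, test_same))
print("different-speaker trial accepted:", same_speaker(enrolled, test_other))
```

Raising the threshold trades false accepts for false rejects, so verification systems are usually compared at an operating point such as the equal error rate rather than at a fixed threshold.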