facebookresearch/svoice
A PyTorch implementation of a neural network model for separating a mixed audio signal into individual speaker voices.

Velocity (7d): +0.7 ★/day
Trend: steady
This repository provides a PyTorch implementation of a research paper on voice separation. It uses gated neural networks to separate overlapping speech from multiple speakers in a single audio mixture. A separate model is trained for each possible number of speakers, and the model trained for the largest number of speakers is used to determine the actual speaker count in each sample. The network processes the audio signal over time with 1D convolutions and recurrent blocks.
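
The speaker-count selection described above can be illustrated with a small, dependency-free sketch. It is not the repository's code. It assumes that the largest model emits one channel per possible speaker, and that a channel with no speaker is near-silent and can be detected by a simple RMS threshold. The helper names (`rms`, `estimate_speaker_count`, `pick_model`), the threshold value, and the model labels are all hypothetical.

```python
import math


def rms(channel):
    """Root-mean-square amplitude of one channel (a sequence of float samples)."""
    if not channel:
        return 0.0
    return math.sqrt(sum(s * s for s in channel) / len(channel))


def estimate_speaker_count(channels, silence_rms=1e-3):
    """Count channels whose RMS exceeds an assumed silence threshold."""
    return sum(1 for ch in channels if rms(ch) > silence_rms)


def pick_model(models_by_count, channels, silence_rms=1e-3):
    """Return (count, model) for the model whose speaker count is closest to the estimate."""
    count = estimate_speaker_count(channels, silence_rms)
    # Ties between equally distant counts are broken toward the smaller count.
    nearest = min(models_by_count, key=lambda c: (abs(c - count), c))
    return nearest, models_by_count[nearest]


# Toy stand-in for the output of a hypothetical 5-speaker model:
# two active channels (sine tones) and three silent ones.
active_a = [0.5 * math.sin(2 * math.pi * 220 * t / 8000) for t in range(800)]
active_b = [0.3 * math.sin(2 * math.pi * 330 * t / 8000) for t in range(800)]
silent = [0.0] * 800
outputs = [active_a, silent, active_b, silent, silent]

# Placeholder labels; in practice these would be loaded model objects.
models = {2: "model_2spk", 3: "model_3spk", 4: "model_4spk", 5: "model_5spk"}
count, chosen = pick_model(models, outputs)
print(count, chosen)
```

A fixed RMS threshold is the simplest possible silence detector and is used here only to keep the sketch short. A real system would more likely use a proper voice activity detector on each output channel.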