facebookresearch/encodec
A neural audio codec for high-fidelity audio compression at multiple bandwidths.

State-of-the-art neural audio codec presented in the paper "High Fidelity Neural Audio Compression". It uses a convolutional encoder-decoder with LSTM layers and residual vector quantization to compress audio at bitrates from 1.5 to 24 kbps. It supports both causal 24 kHz monophonic audio and non-causal 48 kHz stereophonic music. An entropy model based on a small transformer can compress the quantized representation by up to a further 40% without additional quality loss. Training uses a novel multiscale complex spectrogram discriminator.
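
The relationship between bitrate and residual vector quantization can be sketched with some simple arithmetic. The example below is an illustration, not code from the repository. It assumes figures taken from the paper's 24 kHz model, which this description does not state: 75 latent frames per second and 1024 entries (10 bits) per codebook. The function names `codebooks_for_bandwidth` and `entropy_coded_floor` are hypothetical.

```python
import math


def codebooks_for_bandwidth(kbps, frame_rate_hz=75, codebook_size=1024):
    """Number of residual codebooks needed to reach a target bitrate.

    Assumption: each frame emits one index per codebook, so one codebook
    costs frame_rate_hz * log2(codebook_size) bits per second.
    """
    bits_per_index = math.log2(codebook_size)        # 10 bits for 1024 entries
    bps_per_codebook = frame_rate_hz * bits_per_index  # 750 bits/s per codebook
    return round(kbps * 1000 / bps_per_codebook)


def entropy_coded_floor(kbps, max_saving=0.40):
    """Best-case bitrate if the entropy model achieves its full stated saving."""
    return kbps * (1.0 - max_saving)


if __name__ == "__main__":
    for kbps in (1.5, 3, 6, 12, 24):
        n = codebooks_for_bandwidth(kbps)
        print(f"{kbps:>4} kbps -> {n:>2} codebooks, "
              f"best case {entropy_coded_floor(kbps):.2f} kbps after entropy coding")
```

Under these assumptions, each doubling of the bitrate doubles the number of codebooks applied to the residual, which is how a single model can serve several bandwidths. The 40% figure is an upper bound, so the actual saving depends on the audio being coded.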