lucidrains/soundstorm-pytorch
PyTorch implementation of Google DeepMind's SoundStorm for parallel audio generation from vector-quantized codes.

Velocity (7d): +1.4 ★/day · Trend: → steady
This repository provides a PyTorch implementation of SoundStorm, a non-autoregressive audio generation model developed by Google DeepMind. It applies the MaskGIT approach (iteratively predicting masked tokens in parallel) to residual vector-quantized codes from SoundStream/EnCodec, using a Conformer architecture. The model synthesizes high-quality audio in parallel from semantic token sequences.
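To make the MaskGIT-style decoding idea concrete, here is a minimal, self-contained sketch of iterative parallel unmasking with a cosine schedule. It is not the repository's code: `toy_predictor` is a hypothetical stand-in for the Conformer (it returns random tokens and confidences), and `maskgit_decode`, `cosine_mask_fraction`, and `MASK` are names invented for this illustration. It also decodes a single token sequence, whereas SoundStorm works over several residual quantizer levels.

```python
import math
import random

MASK = -1  # sentinel for a position that has not been decoded yet (illustrative)


def cosine_mask_fraction(progress):
    """Fraction of positions left masked at decoding progress in [0, 1]."""
    return math.cos(0.5 * math.pi * progress)


def toy_predictor(tokens, codebook_size, rng):
    """Hypothetical stand-in for the network.

    Returns one (token, confidence) guess per position. A real model would
    condition on the already-decoded tokens; this one ignores them.
    """
    return [(rng.randrange(codebook_size), rng.random()) for _ in tokens]


def maskgit_decode(seq_len, steps, codebook_size, seed=0):
    """Decode seq_len tokens in at most `steps` parallel passes."""
    rng = random.Random(seed)
    tokens = [MASK] * seq_len

    for step in range(1, steps + 1):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break

        guesses = toy_predictor(tokens, codebook_size, rng)

        # How many positions should still be masked after this pass.
        keep_masked = int(seq_len * cosine_mask_fraction(step / steps))
        # Always commit at least one position; commit everything on the last pass.
        keep_masked = min(keep_masked, len(masked) - 1)
        if step == steps:
            keep_masked = 0

        # Commit the most confident guesses; the rest stay masked for later passes.
        masked.sort(key=lambda i: guesses[i][1], reverse=True)
        for i in masked[: len(masked) - keep_masked]:
            tokens[i] = guesses[i][0]

    return tokens


if __name__ == "__main__":
    print(maskgit_decode(seq_len=16, steps=4, codebook_size=1024))
```

The point of the schedule is that few tokens are committed in early passes, when the model has little context, and many in later passes. The number of forward passes is fixed by `steps` rather than by sequence length, which is where the speed-up over autoregressive decoding comes from.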