lucidrains/audiolm-pytorch

A PyTorch implementation of Google's AudioLM for neural audio generation and synthesis.

Velocity (7d): +1.9 ★/day · Trend: steady

This repository provides a PyTorch implementation of AudioLM, Google’s language modeling approach to audio generation. It uses attention-based transformers to model audio as sequences of discrete semantic and acoustic tokens rather than as raw waveforms. The implementation extends AudioLM with T5 text conditioning and classifier-free guidance, enabling text-to-audio and text-to-speech generation similar to VALL-E. It also includes an MIT-licensed SoundStream implementation and is compatible with Meta’s (formerly Facebook’s) EnCodec for neural audio compression.
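
The classifier-free guidance mentioned above is usually applied at sampling time by blending two sets of next-token logits: one computed with the text conditioning and one computed without it. The following is a minimal sketch of that blending step in plain Python. It is not taken from the repository; the function name `cfg_logits` and the parameter `cond_scale` are hypothetical names chosen for illustration.

```python
def cfg_logits(cond_logits, uncond_logits, cond_scale=3.0):
    """Blend conditional and unconditional logits (classifier-free guidance).

    Hypothetical helper for illustration only, not the repository's API.
    cond_scale = 0 returns the unconditional logits, cond_scale = 1 returns
    the conditional logits, and larger values push further toward the
    text-conditioned prediction.
    """
    if len(cond_logits) != len(uncond_logits):
        raise ValueError("logit vectors must have the same length")
    # Move from the unconditional logits toward the conditional ones,
    # scaled by cond_scale.
    return [u + cond_scale * (c - u) for c, u in zip(cond_logits, uncond_logits)]


if __name__ == "__main__":
    conditional = [2.0, 0.0]    # logits given the text prompt
    unconditional = [1.0, 1.0]  # logits with the prompt dropped
    print(cfg_logits(conditional, unconditional, cond_scale=3.0))
```

With a scale above 1 the gap between the two predictions is amplified, which tends to make samples follow the text prompt more closely at some cost in diversity.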
