zhvng/open-musiclm
A PyTorch implementation of Google's MusicLM for generating music from text prompts.

This repository reproduces MusicLM, a state-of-the-art text-to-music generation model from Google Research. It replaces three of the original components with openly available alternatives: CLAP instead of MuLan for joint audio-text embeddings, Meta’s Encodec instead of SoundStream for neural audio coding, and MERT instead of w2v-BERT for audio representations. Transformer models generate musical audio conditioned on a text description. Early audio samples demonstrating the model’s text-to-music generation are available.
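
The component substitutions described above can be summarized as a small lookup table. The following is only an illustrative sketch: the `Component` class, the `COMPONENTS` list, and the `replacement_for` helper are hypothetical names invented for this example and are not part of the repository's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One building block of MusicLM and its stand-in in this reproduction."""
    role: str          # what the component does in the pipeline
    musiclm: str       # module used in Google's MusicLM
    replacement: str   # module used in this repository


# Hypothetical table for illustration; the pairs come from the description above.
COMPONENTS = [
    Component("joint audio-text embedding", "MuLan", "CLAP"),
    Component("neural audio coding", "SoundStream", "Encodec"),
    Component("audio representation", "w2v-BERT", "MERT"),
]


def replacement_for(original: str) -> str:
    """Return the replacement for a MusicLM component (case-insensitive)."""
    for component in COMPONENTS:
        if component.musiclm.lower() == original.lower():
            return component.replacement
    raise KeyError(f"unknown MusicLM component: {original}")


if __name__ == "__main__":
    for component in COMPONENTS:
        print(f"{component.role}: {component.musiclm} -> {component.replacement}")
```

Running the script prints one line per role, pairing each original MusicLM module with the model used in its place.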