Text-to-Audio/Make-An-Audio
A conditional diffusion probabilistic model that generates high-fidelity audio from text and other input modalities.

Velocity (7d): +0.6 ★/day · Trend: steady
Make-An-Audio is a generative system that produces audio from text prompts using a latent diffusion model. It conditions generation with a prompt-enhanced diffusion approach and, in addition to text, accepts video as an input modality. The repository provides a PyTorch implementation and pretrained models for audio generation and audio inpainting.