facebookresearch/audiocraft

Meta's audio lab: a full stack for AI-generated sound

A PyTorch toolkit that bundles neural audio codecs, text-to-music models, and training pipelines for researchers who want to generate or compress audio with deep learning.

23.4k stars · Jupyter Notebook · Image · Video · Audio · ML Frameworks
Velocity (7d): +21 ★ / day · Trend: steady

What it does

AudioCraft is Meta’s PyTorch library for audio generation research. It packages inference and training code for several models: MusicGen and JASCO for text-to-music, AudioGen for text-to-sound effects, EnCodec as a neural audio codec, Multi Band Diffusion as a diffusion-based decoder for EnCodec tokens, MAGNeT for non-autoregressive generation of music and sound effects, and AudioSeal for watermarking. The models share common components and configurable training pipelines.

The interesting bit

The library treats audio generation as a stack rather than a grab bag. EnCodec compresses audio into discrete tokens; MusicGen and its sibling models generate those tokens from text (or chords, or melodies); Multi Band Diffusion decodes them back to waveforms. You can use the pieces separately or retrain the whole pipeline. According to the README, training code is available for EnCodec, MusicGen, Multi Band Diffusion, and JASCO.
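The token hand-off described above can be sketched with AudioCraft's Python API. Treat this as a minimal sketch, not the project's reference usage: it assumes the audiocraft package is installed and that the facebook/musicgen-small checkpoint can be downloaded. The names check_request and text_to_music are hypothetical helpers introduced here for illustration; the MusicGen and MultiBandDiffusion calls follow the usage patterns in the repository's own documentation.

```python
from typing import List, Sequence


def check_request(descriptions: Sequence[str], duration_s: float) -> List[str]:
    """Validate inputs before any model weights are loaded (hypothetical helper)."""
    cleaned = [d.strip() for d in descriptions if d and d.strip()]
    if not cleaned:
        raise ValueError("need at least one non-empty text description")
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return cleaned


def text_to_music(descriptions: Sequence[str], duration_s: float = 8.0,
                  use_diffusion_decoder: bool = False):
    """Text -> discrete codec tokens -> waveform (hypothetical wrapper)."""
    prompts = check_request(descriptions, duration_s)

    # Imported lazily so this module still loads without audiocraft installed.
    from audiocraft.models import MusicGen, MultiBandDiffusion

    # Assumption: the small checkpoint is enough for a quick experiment.
    model = MusicGen.get_pretrained("facebook/musicgen-small")
    model.set_generation_params(duration=duration_s)

    # The language model emits codec tokens; by default the codec's own
    # decoder turns them into audio.
    wav, tokens = model.generate(prompts, return_tokens=True)

    if use_diffusion_decoder:
        # Alternative decoding path: Multi Band Diffusion consumes the same tokens.
        mbd = MultiBandDiffusion.get_mbd_musicgen()
        wav = mbd.tokens_to_wav(tokens)

    return wav, model.sample_rate
```

Calling text_to_music(["lo-fi beat with warm piano"], duration_s=8, use_diffusion_decoder=True) should return a batch of waveforms together with the model's sample rate. The first call downloads weights, so expect it to be slow.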

Key highlights

  • MusicGen: text-to-music with optional melodic conditioning (hum a tune, get an arrangement)
  • JASCO: adds chord, melody, and drum track conditioning for finer control
  • EnCodec: neural codec at the center, tokenizing audio for generation models
  • AudioSeal: built-in watermarking for generated audio
  • Training pipelines: not just inference; includes configs and grids for reproducing papers

Caveats

  • Model weights are CC-BY-NC 4.0 (non-commercial), while the code itself is MIT — check your use case
  • Requires Python 3.9 and PyTorch 2.1.0 exactly; the install instructions warn about xformers compatibility
  • Depends on ffmpeg; the conda install instructions pin it below version 5 (ffmpeg<5)
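The version constraints above are easy to check before installing. The following is a rough preflight sketch using only the Python standard library. All function names are hypothetical, the expected versions are taken from the caveats above, and the ffmpeg check applies the conda "<5" pin to whatever ffmpeg is on PATH, which may be stricter than a non-conda setup needs.

```python
import re
import shutil
import subprocess
import sys
from typing import List, Optional, Tuple


def parse_version(text: str) -> Optional[Tuple[int, ...]]:
    """Extract the first dotted number, e.g. '2.1.0+cu121' -> (2, 1, 0)."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(0).split("."))


def python_ok(version: Tuple[int, int]) -> bool:
    """True when the interpreter is Python 3.9."""
    return tuple(version[:2]) == (3, 9)


def torch_ok(version: str) -> bool:
    """True when the PyTorch version string denotes 2.1.0 (build suffix ignored)."""
    parsed = parse_version(version)
    return parsed is not None and parsed[:3] == (2, 1, 0)


def ffmpeg_major(version_output: str) -> Optional[int]:
    """Major version from `ffmpeg -version` output, or None if it is not a release number."""
    match = re.search(r"ffmpeg version n?(\d+)\.", version_output)
    return int(match.group(1)) if match else None


def environment_report() -> List[str]:
    """Collect human-readable warnings; an empty list means nothing was flagged."""
    notes: List[str] = []
    if not python_ok(sys.version_info[:2]):
        notes.append(f"Python {sys.version_info.major}.{sys.version_info.minor} found; 3.9 expected")
    try:
        import torch  # optional: only inspected when already installed
    except ImportError:
        notes.append("PyTorch not installed; 2.1.0 expected")
    else:
        if not torch_ok(torch.__version__):
            notes.append(f"PyTorch {torch.__version__} found; 2.1.0 expected")
    ffmpeg = shutil.which("ffmpeg")
    if ffmpeg is None:
        notes.append("ffmpeg not found on PATH")
    else:
        try:
            out = subprocess.run([ffmpeg, "-version"], capture_output=True,
                                 text=True, timeout=10).stdout
        except (OSError, subprocess.SubprocessError):
            out = ""
        major = ffmpeg_major(out)
        if major is None:
            notes.append("could not determine ffmpeg version")
        elif major >= 5:
            notes.append(f"ffmpeg {major}.x found; conda instructions pin ffmpeg<5")
    return notes
```

Printing each entry of environment_report() gives a quick list of mismatches. It does not check xformers, which the install instructions flag separately.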

Verdict

Worth a look if you’re doing research in neural audio generation or need a controllable music LM to build on. Skip it if you want a simple API for casual music generation: the value here is in the training code and composable pieces, not a polished end-user product.
