
NVIDIA/audio-flamingo

NVIDIA's Audio Flamingo is a series of open-source multimodal LLMs that understand speech and music through natural language interactions.

Velocity (7d): +1.5 ★/day · Trend: steady

Audio Flamingo provides PyTorch implementations of large language models trained to understand audio through text queries. The models support audio captioning, question answering, reasoning, and long-audio understanding across the speech and music domains. Multiple versions have been published at top ML venues (ICML, NeurIPS), and the latest is fully open-sourced for research use.
