nari-labs/dia2
A streaming dialogue text-to-speech model that generates audio in real time as text is provided.

Velocity (7d): +5.6 ★/day
Trend: steady
Dia2 is an open-weight TTS model from Nari Labs with 1B and 2B parameter variants capable of real-time streaming audio generation. The model can begin producing speech as the first few words are provided, and supports conditioning on audio input to enable natural conversational dialogues. Inference is available via a CLI that supports CUDA graph optimization and bfloat16 precision, with model weights hosted on Hugging Face.
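
As a rough illustration of what bfloat16 precision means for the two variants, the sketch below estimates the memory taken by the weights alone. It assumes the nominal 1B and 2B parameter counts are exact, and it ignores activations, caches, and any audio codec, so real usage will be higher. The names `VARIANTS` and `weight_memory_gib` are made up for this example and are not part of the Dia2 codebase.

```python
# Back-of-the-envelope weight memory for the two Dia2 variants.
# Assumption: parameter counts are the nominal 1e9 and 2e9; the true counts may differ.

BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}

# Hypothetical labels for the two sizes mentioned in the description.
VARIANTS = {"1B": 1e9, "2B": 2e9}


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Return the size of the weights in GiB for the given dtype."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


for name, params in VARIANTS.items():
    bf16 = weight_memory_gib(params, "bfloat16")
    fp32 = weight_memory_gib(params, "float32")
    print(f"{name}: {bf16:.2f} GiB in bfloat16, {fp32:.2f} GiB in float32")
```

Under these assumptions the weights come to roughly 1.9 GiB (1B) and 3.7 GiB (2B) in bfloat16, half of what float32 would need.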