
FunAudioLLM/Fun-ASR

An end-to-end speech recognition model supporting 31 languages with real-time transcription, speaker diarization, and multi-accent recognition.

1.2k stars · Python · Image · Video · Audio
Velocity (7d): +7.0 ★ / day · Trend: steady

Fun-ASR is an automatic speech recognition (ASR) system developed by Tongyi Lab that converts speech audio to text across 31 languages and dialects. It is trained on tens of millions of hours of real speech data and supports real-time low-latency transcription, speaker diarization, hotword enhancement, and timestamp output. The model is available on ModelScope and Hugging Face as Fun-ASR-Nano.
