
FunAudioLLM/SenseVoice

A multilingual speech understanding foundation model supporting ASR, emotion recognition, and audio event detection across 50+ languages.

Velocity · 7d: +12 ★ / day · Trend: steady

SenseVoice is a speech foundation model for automatic speech recognition (ASR), spoken language identification, speech emotion recognition, and audio event detection. It uses a non-autoregressive, end-to-end architecture trained on over 400,000 hours of data, which keeps inference latency low while supporting 50+ languages. The model is implemented in PyTorch and is available through ModelScope and Hugging Face.
