double22a/speech_dataset
A curated list of Chinese and English speech recognition datasets with durations and download links.
★ 460 stars · Data Tooling

Velocity (7d): +0.2 ★/day · Trend: steady
This repository aggregates and documents publicly available speech datasets for automatic speech recognition research and development. It catalogs datasets across multiple languages including Mandarin Chinese and English, with metadata such as duration in hours and source URLs. The listed datasets include well-known resources like LibriSpeech, Common Voice, Aishell, and WenetSpeech, serving as a reference index for speech ML practitioners.
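As a rough illustration of the kind of metadata described above (a name, a language, a duration in hours, and a source URL), a catalog entry might be modeled as below. This is only a sketch: the `DatasetEntry` class, the `total_hours` helper, and every entry in `CATALOG` are hypothetical placeholders, not the repository's actual format or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetEntry:
    """One row of a hypothetical speech-dataset catalog."""
    name: str
    language: str  # language code, e.g. "zh" or "en"
    hours: float   # total audio duration in hours
    url: str       # source or download page


def total_hours(entries, language):
    """Sum the listed duration of all entries for one language."""
    return sum(e.hours for e in entries if e.language == language)


# Placeholder entries: names, hours, and URLs are invented for illustration
# and do not correspond to any real dataset.
CATALOG = [
    DatasetEntry("example-zh-a", "zh", 150.0, "https://example.com/zh-a"),
    DatasetEntry("example-zh-b", "zh", 50.0, "https://example.com/zh-b"),
    DatasetEntry("example-en-a", "en", 100.0, "https://example.com/en-a"),
]

if __name__ == "__main__":
    for lang in ("zh", "en"):
        print(lang, total_hours(CATALOG, lang))
```

Keeping duration as a numeric field, rather than free text, is what makes per-language totals and filtering straightforward.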