zzw922cn/awesome-speech-recognition-speech-synthesis-papers
Curated collection of academic papers on automatic speech recognition, text-to-speech synthesis, voice conversion, and related audio generation techniques.

This repository aggregates research papers on speech and audio processing, including automatic speech recognition (ASR), speaker verification, voice conversion (VC), text-to-speech (TTS) synthesis, and text-to-audio generation. The collection spans approaches from classical hidden Markov models to modern deep learning methods using CNNs, RNNs, attention mechanisms, and diffusion models. It also includes papers on language modeling for speech tasks and music generation.