sooftware/kospeech

KoSpeech: A Korean ASR toolkit that admits defeat gracefully

An archived PyTorch library for Korean speech recognition that now redirects users to better-maintained alternatives.

What it does

KoSpeech is a PyTorch-based toolkit for training end-to-end Korean automatic speech recognition models. It bundles implementations of seven architectures (Deep Speech 2, LAS, RNN-Transducer, Speech Transformer, Jasper, Conformer, and Joint CTC-Attention variants) along with preprocessing pipelines for the KsponSpeech corpus and LibriSpeech. Configuration is handled through Hydra, so the model and training options are chosen with command-line overrides rather than by editing code.
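
To make the Hydra point concrete, here is a minimal stdlib-only sketch of the dotted key=value override style that Hydra uses on the command line. It does not call Hydra, and the keys shown (model, train.batch_size, train.lr) are hypothetical placeholders, not KoSpeech's actual config schema.

```python
# Stdlib-only sketch of Hydra-style "key=value" command-line overrides.
# All config keys below (model, train.batch_size, train.lr) are hypothetical,
# not KoSpeech's real schema, and this is not Hydra's implementation.
from __future__ import annotations

from copy import deepcopy


def _parse(raw: str):
    """Convert a string to int or float when possible, else keep it as str."""
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            continue
    return raw


def apply_overrides(defaults: dict, overrides: list[str]) -> dict:
    """Return a copy of `defaults` with dotted key=value overrides applied."""
    config = deepcopy(defaults)
    for item in overrides:
        dotted_key, _, raw_value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = config
        for name in parents:
            # Walk (or create) the nested section named by each parent key.
            node = node.setdefault(name, {})
        node[leaf] = _parse(raw_value)
    return config


# Hypothetical defaults, standing in for what a YAML config file would supply.
DEFAULTS = {"model": "ds2", "train": {"batch_size": 32, "lr": 1e-4}}

# The same two overrides a user might append to a training command.
config = apply_overrides(DEFAULTS, ["model=conformer", "train.batch_size=16"])
print(config)
```

The practical upshot is that switching architectures becomes a change to one argument while the rest of the command stays the same, which is presumably what makes the training commands consistent across models.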

The interesting bit

The author archived the repo and posted a refreshingly honest redirect: want to train your own model? Go to OpenSpeech. Want inference right now? Use Pororo ASR or Whisper. This kind of maintenance realism is rare in open-source ASR, where abandoned toolkits usually just rot in silence.

Key highlights

  • Seven model architectures in one codebase, from 2015-era Deep Speech 2 to 2020 Conformer
  • Hydra-based configuration keeps training commands consistent across models
  • Includes preprocessing for KsponSpeech, a 1,000-hour Korean corpus that had no established preprocessing method or baseline model when KoSpeech was published
  • Published in Software Impacts (Elsevier), so there’s a citable paper
  • Apache 2.0 licensed

Caveats

  • Archived and unmaintained — the author explicitly recommends other projects
  • Subword and grapheme units are noted as “currently not tested”
  • The author admits that a major refactor went mostly untested because of time constraints

Verdict

Worth a look if you’re studying Korean ASR history or need to reproduce the paper’s specific KsponSpeech baselines. Everyone else should follow the author’s own advice and head to OpenSpeech, Pororo, or Whisper.
