edobashira/speech-language-processing

A 2,200-star map to the speech/NL tooling wilderness

A curated list that catalogs finite-state transducers, language models, and speech recognizers so you don't have to hunt them down yourself.

★2.2k stars Learning Language Models

What it does

This repo is a manually maintained index of open-source tools and datasets across speech and natural language processing. Categories include finite-state toolkits, language modeling libraries, speech recognizers, signal processing utilities, text-to-speech systems, speech corpora, and machine translation frameworks. Each entry gets a one-line description and a link.

The interesting bit

The list leans hard into the classical, toolkit-heavy side of NLP (weighted finite-state transducers, hidden Markov models, and n-gram smoothing), making it a useful time capsule of the pre-transformer toolchain. The maintainer’s personal favorites (“my personal favourite LM toolkit”) and occasional dead links give it the flavor of an actual researcher’s bookmarks folder rather than SEO content.

Key highlights

Covers niche tooling rarely aggregated elsewhere: OpenFst wrappers, WFST decoders, Pitman-Yor process libraries, segmental CRF toolkits
Includes speech corpora and lexical resources (LibriSpeech, TED-LIUM, and the CMUdict pronunciation dictionary) alongside software
Entries span multiple decades and maintenance statuses, from actively developed (Kaldi) to explicitly unmaintained (MIT FST Toolkit)
Sub-categories are alphabetized, which helps browsing but doesn’t prioritize by relevance or freshness

Caveats

Several links point to defunct hosting (Google Code, raw .zip files on personal sites) with no archival fallback noted
No clear criteria for inclusion or deprecation; some descriptions are copied from project homepages without verification
Machine Translation section is truncated in the source, so coverage there is incomplete

Verdict

Worth bookmarking if you’re maintaining legacy speech pipelines, researching historical NLP approaches, or need a starting point for comparing finite-state libraries. Skip it if you want modern neural-only stacks or actively curated, annotated guidance.

Frequently asked

What is edobashira/speech-language-processing?

A curated list that catalogs finite-state transducers, language models, and speech recognizers so you don't have to hunt them down yourself.

Is speech-language-processing open source?

Yes, edobashira/speech-language-processing is an open-source project tracked on heatdrop.

How popular is speech-language-processing?

edobashira/speech-language-processing has 2.2k stars on GitHub.

Where can I find speech-language-processing?

edobashira/speech-language-processing is on GitHub at https://github.com/edobashira/speech-language-processing.