Is allosaurus open source?

Yes — xinjli/allosaurus is open source, released under the GPL-3.0 license.

What language is allosaurus written in?

xinjli/allosaurus is primarily written in Python.

How popular is allosaurus?

xinjli/allosaurus has 737 stars on GitHub.

Where can I find allosaurus?

xinjli/allosaurus is on GitHub at https://github.com/xinjli/allosaurus.

xinjli/allosaurus

A speech recognizer that listens for sounds, not words

Allosaurus turns raw audio into IPA phone sequences for over 2000 languages, built for researchers who care about speech sounds more than word meanings.

★ 737 stars | Python | Image · Video · Audio | Language Models

What it does

Allosaurus takes mono WAV audio and emits a sequence of IPA phones. It ships with a pretrained universal model covering roughly 2000 languages, plus an English-specific variant if you want tighter accuracy for that one language.
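
A minimal usage sketch, assuming the `allosaurus` package is installed and that its Python entry point is `read_recognizer` in `allosaurus.app`, with `recognize(path, lang_id)` returning a space-separated phone string, as the project README describes at the time of writing. The helper names `split_phones` and `transcribe` are illustrative and not part of the library; check the installed version before relying on the exact signature.

```python
def split_phones(output: str) -> list:
    """Split a space-separated phone string into a list of IPA symbols."""
    return output.split()


def transcribe(wav_path: str, lang_id: str = "ipa") -> list:
    """Run the pretrained recognizer on a mono WAV file.

    Assumption: read_recognizer() loads the default universal model and
    recognize() returns phones separated by spaces. "ipa" is assumed to
    select the full universal inventory.
    """
    # Imported lazily so split_phones works without the package installed.
    from allosaurus.app import read_recognizer

    model = read_recognizer()
    return split_phones(model.recognize(wav_path, lang_id))
```

Under those assumptions, `transcribe("sample.wav")` would use the universal inventory, and `transcribe("sample.wav", "eng")` would restrict output to the English one.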

The interesting bit

Instead of transcribing words, it transcribes the raw sounds, like a very patient phonetics professor in software form. The underlying model was trained on a multilingual allophone system, so it first maps acoustics to language-independent phones and can then narrow the output to a specific language's phone inventory on request.
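
To make the "universal first, then narrow" idea concrete, here is a toy sketch. It is not Allosaurus's actual implementation: the phone set, the scores, and the function `pick_phone` are made up for illustration. It shows one plausible way to restrict a frame's phone scores to a language inventory before choosing the best candidate.

```python
from typing import Dict, Optional, Set


def pick_phone(scores: Dict[str, float], inventory: Optional[Set[str]] = None) -> str:
    """Return the best-scoring phone, optionally limited to an inventory."""
    if inventory is not None:
        allowed = {p: s for p, s in scores.items() if p in inventory}
        if not allowed:
            raise ValueError("inventory shares no phones with the model output")
        scores = allowed
    return max(scores, key=scores.get)


# Hypothetical scores for one audio frame over a tiny "universal" phone set.
frame_scores = {"ɹ": 0.30, "r": 0.45, "ɾ": 0.25}

universal_choice = pick_phone(frame_scores)            # no restriction
english_choice = pick_phone(frame_scores, {"ɹ", "ɾ"})  # toy English-like inventory
```

With these invented numbers the unrestricted pick is the trill `r`, while the restricted pick falls back to `ɹ`, the best candidate the toy inventory allows.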

Key highlights

Supports around 2000 language inventories, though all except English rely on a single universal acoustic model
Outputs standard IPA symbols, with an optional timestamp approximation for each phone
Lets you customize the phone inventory by editing a plain text file of IPA symbols
Can emit top-k probable phones per frame if you want ambiguity explicitly spelled out
Runs inference on CPU by default; GPU optional
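
The timestamp and top-k highlights are easier to picture with a toy CTC decoder. This is a simplified sketch, not the project's code: the 10 ms frame duration, the `<blk>` blank symbol, the per-frame probabilities, and the function `greedy_ctc` are all assumptions for illustration. It collapses repeated frame labels, drops blanks, reports the time of the first frame where each phone wins, and keeps the k most probable candidates at that frame.

```python
BLANK = "<blk>"  # hypothetical blank symbol used by CTC


def greedy_ctc(frames, frame_sec=0.01, topk=1):
    """Collapse per-frame phone probabilities into (start_time, candidates).

    frames: list of dicts mapping phone -> probability, one dict per frame.
    """
    out = []
    prev = BLANK
    for i, probs in enumerate(frames):
        ranked = sorted(probs, key=probs.get, reverse=True)
        best = ranked[0]
        # Emit only when a new non-blank label starts.
        if best != BLANK and best != prev:
            candidates = [p for p in ranked if p != BLANK][:topk]
            out.append((round(i * frame_sec, 3), candidates))
        prev = best
    return out


# Invented probabilities for five frames.
frames = [
    {BLANK: 0.9, "k": 0.1},
    {"k": 0.6, "g": 0.3, BLANK: 0.1},
    {"k": 0.7, "g": 0.2, BLANK: 0.1},
    {BLANK: 0.8, "æ": 0.2},
    {"æ": 0.5, "ɛ": 0.4, BLANK: 0.1},
]
phones = greedy_ctc(frames, topk=2)
```

The reported time is simply the frame where the label first beats the blank. CTC-trained models tend to fire somewhere inside a phone rather than at its acoustic boundary, which is one reason timestamps derived this way are only approximate.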

Caveats

Timestamps are documented as approximate, a consequence of the CTC architecture
Only one language-dependent model (English) is currently available; everything else falls back to the universal model
Expects mono-channel WAV input; anything else needs preprocessing
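
For the mono-input requirement, a small preprocessing sketch using only the Python standard library. It assumes 16-bit PCM WAV input and averages the channels; the function name `to_mono` is illustrative, and other formats or sample widths would need a dedicated audio tool instead.

```python
import array
import sys
import wave


def to_mono(src_path: str, dst_path: str) -> None:
    """Write a mono copy of a 16-bit PCM WAV file by averaging channels."""
    with wave.open(src_path, "rb") as src:
        channels = src.getnchannels()
        width = src.getsampwidth()
        rate = src.getframerate()
        raw = src.readframes(src.getnframes())

    if width != 2:
        raise ValueError("this sketch only handles 16-bit PCM")

    samples = array.array("h")
    samples.frombytes(raw)  # interleaved: ch0, ch1, ch0, ch1, ...
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian

    # Average each group of interleaved samples into one mono sample.
    mono = array.array(
        "h",
        (sum(samples[i:i + channels]) // channels
         for i in range(0, len(samples), channels)),
    )
    if sys.byteorder == "big":
        mono.byteswap()

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(width)
        dst.setframerate(rate)
        dst.writeframes(mono.tobytes())
```

A file that is already mono passes through unchanged, so the function is safe to run on every input before recognition.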

Verdict

Grab it if you do phonetic research, low-resource language documentation, or speech analysis where word-level recognition isn’t the goal. Skip it if you need transcripts you can actually read at a dinner party.
