Is language-detection open source?

Yes — patrickschur/language-detection is open source, released under the MIT license.

What language is language-detection written in?

patrickschur/language-detection is primarily written in PHP.

How popular is language-detection?

patrickschur/language-detection has 855 stars on GitHub.

Where can I find language-detection?

patrickschur/language-detection is on GitHub at https://github.com/patrickschur/language-detection.

patrickschur/language-detection

PHP language detection without calling Google Translate

A self-contained n-gram library that ships with pre-trained models for 110 languages and runs entirely offline.

★ 855 stars · PHP · Data Tooling

What it does

Feed it a string of text, get back ranked language guesses with confidence scores. It ships with pre-trained models for 110 languages and a Trainer class to roll your own—whether that’s Klingon, spam vs. ham, or something more practical.
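To make "ranked guesses with scores" concrete, here is a minimal sketch of rank-order n-gram matching in plain PHP. It is not the library's code: the function names ngramProfile and detectLanguage, the two one-sentence training texts, and the scoring formula are all assumptions chosen for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Build a ranked profile: n-gram => rank (0 = most frequent).
 * Simplified assumption: character 1- to 3-grams over lowercased words padded with "_".
 */
function ngramProfile(string $text, int $maxNgrams = 300): array
{
    $counts = [];
    $words = preg_split('/[^\pL]+/u', mb_strtolower($text), -1, PREG_SPLIT_NO_EMPTY) ?: [];
    foreach ($words as $word) {
        $padded = '_' . $word . '_';
        $len = mb_strlen($padded);
        for ($n = 1; $n <= 3; $n++) {
            for ($i = 0; $i + $n <= $len; $i++) {
                $gram = mb_substr($padded, $i, $n);
                $counts[$gram] = ($counts[$gram] ?? 0) + 1;
            }
        }
    }
    arsort($counts); // most frequent n-grams first
    return array_flip(array_slice(array_keys($counts), 0, $maxNgrams));
}

/**
 * Score a text against each model by summing rank differences.
 * N-grams missing from a model get the maximum penalty.
 * Returns language => score in [0, 1], best match first.
 */
function detectLanguage(string $text, array $models, int $maxNgrams = 300): array
{
    $sample = ngramProfile($text, $maxNgrams);
    $worst = max(1, count($sample)) * $maxNgrams;
    $scores = [];
    foreach ($models as $lang => $model) {
        $distance = 0;
        foreach ($sample as $gram => $rank) {
            $distance += isset($model[$gram]) ? abs($rank - $model[$gram]) : $maxNgrams;
        }
        $scores[$lang] = 1.0 - $distance / $worst;
    }
    arsort($scores);
    return $scores;
}

// Made-up, far-too-small training texts; real models are built from much larger corpora.
$samples = [
    'en' => 'the quick brown fox jumps over the lazy dog and then it runs away from the angry farmer',
    'nl' => 'mag het een onsje meer zijn want de hond loopt snel weg van de boze boer',
];
$models = array_map('ngramProfile', $samples);

print_r(detectLanguage('de boer zag een hond', $models));
```

With training data this small the scores mean very little; the point is only the shape of the result, a language-to-score map sorted best first.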

The interesting bit

The library compiles n-gram frequency data into plain PHP arrays rather than JSON (since v4), which is a blunt-force but effective way to dodge parse overhead. You can also cap the n-gram count to trade accuracy for speed, or whitelist specific languages to skip comparisons you don’t need.
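The three ideas in that paragraph can be sketched without the library. The snippet below is an illustration under assumptions: the toy $model data, the temp-file handling, and the cap of two n-grams are made up, and it does not reproduce the library's actual file layout.

```php
<?php
declare(strict_types=1);

// Made-up toy model (n-gram => rank); real models are far larger.
$model = ['_' => 0, 'e' => 1, 'en' => 2, 'n_' => 3];

// Store the same data twice: as a PHP file that returns an array, and as JSON.
$phpFile  = tempnam(sys_get_temp_dir(), 'ngr');
$jsonFile = tempnam(sys_get_temp_dir(), 'ngr');
file_put_contents($phpFile, '<?php return ' . var_export($model, true) . ';');
file_put_contents($jsonFile, json_encode($model));

// Loading the PHP file is a plain require; the JSON needs a decode step on every load.
$fromPhp  = require $phpFile;
$fromJson = json_decode(file_get_contents($jsonFile), true);

// Capping the n-gram count: keep only the top-ranked entries.
$capped = array_slice($model, 0, 2, true);

// Whitelisting: only compare against the languages you care about.
$models   = ['en' => $model, 'nl' => $model, 'de' => $model];
$filtered = array_intersect_key($models, array_flip(['en', 'nl']));

// Clean up the temporary files.
unlink($phpFile);
unlink($jsonFile);
```

Both loads yield the same array. With OPcache enabled, PHP can keep a compiled array file in shared memory between requests, which is probably where most of the saving comes from; that is a general PHP property rather than a claim taken from the project.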

Key highlights

110 built-in languages, with trainable support for custom ones
Method chaining: detect()->blacklist('de')->limit(0, 3)->close()
ArrayAccess lets you pluck scores like $result['nl']
Custom tokenizers via TokenizerInterface for domain-specific text
Requires PHP ≥ 7.4 and the mbstring extension
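For a feel of how chaining and array access can live on one result object, here is a toy class written for this page. ToyResult is hypothetical and is not the library's class; the scores are invented, and the offset/length form of limit() is an assumption modelled on array_slice.

```php
<?php
declare(strict_types=1);

/** Hypothetical stand-in for a detection result; not the library's class. */
final class ToyResult implements ArrayAccess
{
    /** Language code => score, best first. */
    private array $scores;

    public function __construct(array $scores)
    {
        arsort($scores);
        $this->scores = $scores;
    }

    /** Drop the given languages; returns a new object so calls can chain. */
    public function blacklist(string ...$langs): self
    {
        return new self(array_diff_key($this->scores, array_flip($langs)));
    }

    /** Keep a slice of the ranking (offset and length, as in array_slice). */
    public function limit(int $offset, ?int $length = null): self
    {
        return new self(array_slice($this->scores, $offset, $length, true));
    }

    /** End the chain and hand back the plain array. */
    public function close(): array
    {
        return $this->scores;
    }

    public function offsetExists($offset): bool
    {
        return isset($this->scores[$offset]);
    }

    #[\ReturnTypeWillChange]
    public function offsetGet($offset)
    {
        return $this->scores[$offset] ?? null;
    }

    public function offsetSet($offset, $value): void
    {
        throw new LogicException('Results are read-only.');
    }

    public function offsetUnset($offset): void
    {
        throw new LogicException('Results are read-only.');
    }
}

// Invented scores, for illustration only.
$result = new ToyResult(['nl' => 0.66, 'de' => 0.51, 'af' => 0.48, 'en' => 0.40, 'fr' => 0.35]);

$top = $result->blacklist('de')->limit(0, 3)->close(); // chained filtering
$dutchScore = $result['nl'];                            // ArrayAccess lookup
```

Returning a fresh object from each filter keeps the original result untouched, so the same detection can be sliced several ways.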

Caveats

Needs “some sentences” for reliable detection; short strings are dicey
Training with large n-gram counts (the README suggests ~9,000 for better accuracy) is slow, though detection speed stays flat
Upgrading from v3 requires regenerating custom training files from JSON to PHP

Verdict

Worth a look if you’re building a PHP app that needs offline language detection without pulling in heavy ML dependencies. Skip it if you’re already running Python or need real-time detection on single words.
