microsoft/Recognizers-Text

Microsoft's battle-tested entity parser you probably already use

Extract numbers, dates, units, and sequences from messy human text across 14 languages — the same engine behind LUIS and Bot Framework.

★ 1.8k stars · C# · Language Models · Data Tooling

What it does

Microsoft.Recognizers.Text turns unstructured text into structured entities: cardinal numbers, ordinals, percentages, currency, dimensions, temperatures, ages, date/time expressions, plus sequences like emails, URLs, phone numbers, and GUIDs. It handles the messy reality of human language — “next Tuesday at 3pm”, “fifty bucks”, “1.5 meters” — and normalizes them into machine-readable form.
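To make "text in, structured entity out" concrete, here is a minimal Python sketch that uses only the standard library. It is a toy, not the library's implementation or API: the `Entity` class, the `recognize_numbers` function, and the tiny word table are hypothetical names invented for this example, and the real recognizers handle far more phrasing than this.

```python
import re
from dataclasses import dataclass

# Hypothetical, deliberately tiny vocabulary; real recognizers cover far more.
WORD_VALUES = {
    "one": 1, "two": 2, "three": 3, "five": 5, "ten": 10,
    "twenty": 20, "fifty": 50,
}


@dataclass
class Entity:
    text: str       # the matched span as written in the input
    start: int      # index of the first character of the span
    end: int        # index of the last character of the span (inclusive)
    type_name: str  # kind of entity, here always "number"
    value: float    # normalized, machine-readable value


# Digits with an optional decimal part, or one of the known number words.
_NUMBER = re.compile(
    r"\b(?:\d+(?:\.\d+)?|" + "|".join(WORD_VALUES) + r")\b",
    re.IGNORECASE,
)


def recognize_numbers(text):
    """Return every number found in `text`, normalized to a float."""
    entities = []
    for match in _NUMBER.finditer(text):
        raw = match.group(0)
        word = raw.lower()
        value = float(WORD_VALUES[word]) if word in WORD_VALUES else float(raw)
        entities.append(
            Entity(raw, match.start(), match.end() - 1, "number", value)
        )
    return entities


for entity in recognize_numbers("fifty bucks for 1.5 meters of cable"):
    print(entity)
```

The point of the sketch is the output shape: each hit keeps the original span and its position alongside a normalized value, so downstream code never has to re-parse the text.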

The interesting bit

This isn’t a research prototype quietly rusting in a repo. It’s the extraction engine behind the pre-built entities in LUIS and Power Virtual Agents, and it is also available in Microsoft Bot Framework and the Text Analytics Cognitive Service. The project ships for .NET, JavaScript/TypeScript, Python (alpha), and Java (in progress); .NET is the primary target, so new features land there first.

Key highlights

Full support for 10 languages: Chinese, English, French, Spanish, Portuguese, German, Italian, Turkish, Hindi, Dutch
Partial support for Japanese, Korean, Arabic, Swedish; Bulgarian has boolean support only
15 entity types with varying depth — from generic regex sequences (emails, GUIDs) to fully resolved date/time with subtypes
NuGet, npm, and PyPI packages are available; a BibTeX entry is provided for academic citation (a nice touch for the rare open-source project that expects to be cited)
Active contribution paths: open issues, NotSupported spec cases, and translating English test specs to new languages
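The "generic regex sequences" in the list above are the simplest tier of entity: a pattern match with no further resolution step. A rough Python sketch of that idea follows. The patterns and the `find_sequences` helper are simplified assumptions written for this example, not the expressions the library ships.

```python
import re

# Simplified patterns for illustration only; production patterns are stricter.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+"),
    "guid": re.compile(
        r"\b[0-9a-fA-F]{8}-(?:[0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}\b"
    ),
}


def find_sequences(text):
    """Return (type, matched_text, start_offset) tuples sorted by offset."""
    hits = []
    for type_name, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((type_name, match.group(0), match.start()))
    return sorted(hits, key=lambda hit: hit[2])


sample = "Mail ops@example.com about job 123e4567-e89b-12d3-a456-426614174000."
for hit in find_sequences(sample):
    print(hit)
```

Date/time entities sit at the other end of the scale: a match there still has to be resolved against a reference date, which a pattern table like this one cannot do.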

Caveats

Support matrix is genuinely lopsided: Korean DateTime is “specs-only” (tests written, code pending); Arabic units are entirely unsupported; Swedish phone numbers are a no-go
Python and Java ports lag behind .NET, so cross-platform parity is aspirational, not guaranteed
README warns that contribution guides “may have become a little out-of-date”
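Given how uneven coverage is, one defensive pattern is to check a language and entity pair before relying on it. The Python sketch below encodes only the gaps named in the caveats above. The `Support` enum, the `SUPPORT_GAPS` table, its key labels, and both helper functions are hypothetical names for this example; the authoritative matrix is the one in the project README.

```python
from enum import Enum


class Support(Enum):
    NOT_LISTED = "not listed"    # no gap recorded in this table
    SPECS_ONLY = "specs-only"    # tests written, code pending
    UNSUPPORTED = "unsupported"  # no implementation


# Only the gaps named in the caveats above. The keys are illustrative labels,
# not identifiers taken from the library.
SUPPORT_GAPS = {
    ("Korean", "DateTime"): Support.SPECS_ONLY,
    ("Arabic", "Units"): Support.UNSUPPORTED,
    ("Swedish", "PhoneNumber"): Support.UNSUPPORTED,
}


def support_for(language, entity_type):
    """Look up a known gap; NOT_LISTED means this table has no entry."""
    return SUPPORT_GAPS.get((language, entity_type), Support.NOT_LISTED)


def has_known_gap(language, entity_type):
    """True when the pair is specs-only or unsupported in this table."""
    return support_for(language, entity_type) is not Support.NOT_LISTED


for pair in [("English", "DateTime"), ("Korean", "DateTime"), ("Arabic", "Units")]:
    print(pair, support_for(*pair).value)
```

`NOT_LISTED` is not a promise of support. It only means this small table records no gap, so a pair that matters to you should still be checked against the README.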

Verdict

Grab this if you’re building chatbots, parsing forms, or doing any NLP that needs reliable entity extraction without training your own model. Skip it if you need bleeding-edge language coverage (especially Arabic, Korean, or Bulgarian) or if you require guaranteed parity across all four platform ports.

Frequently asked

What is microsoft/Recognizers-Text?

A library that extracts numbers, dates, units, and sequences from messy human text across 14 languages. It is the same engine behind LUIS and Bot Framework.

Is Recognizers-Text open source?

Yes. microsoft/Recognizers-Text is open source, released under the MIT license.

What language is Recognizers-Text written in?

microsoft/Recognizers-Text is primarily written in C#.

How popular is Recognizers-Text?

microsoft/Recognizers-Text has 1.8k stars on GitHub.

Where can I find Recognizers-Text?

microsoft/Recognizers-Text is on GitHub at https://github.com/microsoft/Recognizers-Text.