Is malaya open source?

Yes — malaysia-ai/malaya is open source, released under the MIT license.

What language is malaya written in?

GitHub classifies malaysia-ai/malaya as primarily Jupyter Notebook, based on the files in the repository; the library itself is a Python package.

How popular is malaya?

malaysia-ai/malaya has 527 stars on GitHub.

Where can I find malaya?

malaysia-ai/malaya is on GitHub at https://github.com/malaysia-ai/malaya.

malaysia-ai/malaya

NLP for a language the big toolkits forgot

Malaya gives Malaysian developers first-class PyTorch models for tasks that NLTK and spaCy barely touch.

★ 527 stars · Jupyter Notebook · ML Frameworks · Language Models

What it does

Malaya is a PyTorch-based NLP toolkit purpose-built for Bahasa Malaysia. It covers the standard suspects (sentiment analysis, named entity recognition, POS tagging, language detection, text normalization), but with models tuned for Malaysian Malay rather than borrowed from English in the hope that they transfer.

The interesting bit

The project is old enough to have TensorFlow in its topic tags and young enough to have switched to PyTorch, which suggests actual maintenance rather than abandonware. Pretrained models live on Hugging Face under the mesolitica org, so you’re not stuck training from scratch on a low-resource language.
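One way to see what is actually published under that org is to query the Hub directly. The sketch below is not part of Malaya; it assumes the separate `huggingface_hub` package is installed and that network access is available. The helper names `model_url` and `list_org_models` are made up for illustration.

```python
ORG = "mesolitica"  # the Hugging Face org named in the text above


def model_url(repo_id):
    """Build the web URL for a Hugging Face model repo id such as 'org/name'."""
    return f"https://huggingface.co/{repo_id}"


def list_org_models(org=ORG, limit=5):
    """Return up to `limit` model repo ids published by `org`.

    Needs network access and the `huggingface_hub` package (an assumption;
    Malaya's README is not the source of this call).
    """
    # Imported lazily so model_url() still works without the package installed.
    from huggingface_hub import list_models

    return [model.id for model in list_models(author=org, limit=limit)]
```

Calling `list_org_models()` and passing each id through `model_url` gives a quick list of links to browse. Which of those models Malaya loads for a given task is not stated in the README, so treat the listing as a starting point rather than a map of the library.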

Key highlights

Supports Python 3.6+ and PyTorch 1.10+; leaves PyTorch installation to you so you pick CPU or GPU
Models hosted at huggingface.co/mesolitica
Windows users get dedicated docs (always a tell that someone has suffered)
Research-backed: includes a BibTeX citation and acknowledges TFRC TPU access, suggesting serious training runs
Active enough to have a Discord community
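Because PyTorch installation is left to the user, it can help to check the stated minimums (Python 3.6+, PyTorch 1.10+) before installing anything else. The following is a minimal sketch under that assumption; the function names are hypothetical and none of this comes from Malaya itself.

```python
import re
import sys

MIN_PYTHON = (3, 6)   # minimum stated in the highlights above
MIN_TORCH = (1, 10)   # minimum stated in the highlights above


def parse_version(text):
    """Return the leading (major, minor) of a version string like '2.1.0+cu118'."""
    match = re.match(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version_text, minimum):
    """Compare numerically, so that '1.9' correctly sorts below '1.10'."""
    return parse_version(version_text) >= minimum


def check_environment():
    """Return a list of problems found; an empty list means both minimums are met."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python %d.%d or newer is required" % MIN_PYTHON)
    try:
        import torch  # installed separately, as a CPU or GPU build
    except ImportError:
        problems.append("PyTorch is not installed")
    else:
        if not meets_minimum(torch.__version__, MIN_TORCH):
            problems.append("PyTorch %d.%d or newer is required" % MIN_TORCH)
    return problems
```

Versions are compared as integer tuples rather than strings because a plain string comparison would rank "1.9" above "1.10".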

Caveats

The README is thin on specifics: no model sizes, no benchmark numbers, no latency claims
Jupyter Notebook as the repo language suggests heavy docs/examples; the actual library structure is unclear from the README alone
“Entity framework” in the GitHub topics appears to be a tag misfire, not an ORM

Verdict

Worth a look if you’re building Malay-language products and are tired of forcing multilingual models to cope with local slang and syntax. Skip it if your use case is English-dominant; you already have better-supported options.
