Is awesome-embedding-models open source?

Yes — Hironsan/awesome-embedding-models is open source, released under the MIT license.

What language is awesome-embedding-models written in?

Hironsan/awesome-embedding-models is primarily written in Jupyter Notebook.

How popular is awesome-embedding-models?

Hironsan/awesome-embedding-models has 1.8k stars on GitHub.

Where can I find awesome-embedding-models?

Hironsan/awesome-embedding-models is on GitHub at https://github.com/Hironsan/awesome-embedding-models.

Hironsan/awesome-embedding-models

A reading list for people who miss word2vec

A curated index of embedding-model papers, tools, and pre-trained vectors from the era before LLMs ate everything.

★ 1.8k stars · Jupyter Notebook · Learning Language Models

What it does

This repo is a classic “awesome list”: a hand-maintained index of resources for word, sentence, and document embeddings. It catalogs foundational papers (word2vec, GloVe, FastText, BERT, ELMo), key researchers, courses, datasets, and pre-trained model links. Think of it as a bibliography with working hyperlinks.

The interesting bit

The list is frozen in a specific moment: 2018, when contextual embeddings like ELMo and BERT were arriving but before the transformer tsunami fully hit. It captures the transition from static word vectors to contextualized representations, with an entire section debating whether count-based or prediction-based methods win, a fight that now feels almost quaint.

Key highlights

Heavyweight paper coverage: Mikolov’s word2vec series, GloVe, FastText, plus the first BERT and ELMo papers
Pre-trained vector links for 157 languages via FastText, plus biomedical specials (BioWordVec, BioSentVec)
Curated researcher list (Mikolov, Bengio, Goldberg, Levy, Chen) with Google Scholar links
Evaluation datasets and papers questioning whether word-similarity tasks actually predict downstream performance
Implementation links for gensim, TensorFlow word2vec tutorials, and a GPU-optimized GloVe layer
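
For readers unfamiliar with the word-similarity evaluations mentioned above: such benchmarks are typically scored by comparing the cosine similarity of two word vectors against human similarity ratings, then reporting the Spearman rank correlation between the two. The sketch below shows that scoring in plain Python. The vectors, word pairs, and ratings are made up for illustration and do not come from the repository or from any real dataset; the helper names (`cosine`, `spearman`, `evaluate`) are hypothetical too.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Toy 3-dimensional "embeddings" (invented values, not real vectors).
vectors = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.1, 0.9, 0.2],
    "truck": [0.1, 0.9, 0.3],
}

# Hypothetical human similarity ratings on a 0-10 scale.
pairs = [
    ("cat", "dog", 9.0),
    ("car", "truck", 8.5),
    ("cat", "car", 2.0),
    ("dog", "truck", 1.5),
]


def evaluate(vectors, pairs):
    """Spearman correlation between model similarities and human ratings."""
    model_scores = [cosine(vectors[a], vectors[b]) for a, b, _ in pairs]
    human_scores = [rating for _, _, rating in pairs]
    return spearman(model_scores, human_scores)


if __name__ == "__main__":
    print(f"Spearman correlation: {evaluate(vectors, pairs):.2f}")
```

A high correlation here only says the vectors order word pairs the way the raters did. The papers the list collects question whether that transfers to downstream tasks.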

Caveats

Last substantive update appears to be circa 2018; no modern sentence transformers, no OpenAI embeddings, no retrieval-augmented generation
The “Articles” section is commented out in the source, suggesting unfinished maintenance
Some TensorFlow links point to r0.12 documentation, which is archaeological at this point

Verdict

Worth a bookmark if you’re doing historical NLP research, teaching an embeddings course, or need a quick reference to the pre-transformer canon. Skip it if you want practical guidance on modern vector search or API-based embedding services: this is a museum, not a manual.
