nlptown/nlp-notebooks

NLP notebooks that skip the textbook preamble

A curated collection of runnable notebooks covering word embeddings to BERT, aimed at developers who want to learn by breaking things.

1k stars · Jupyter Notebook · Learning
Velocity (7d): +0.3 ★ / day · Trend: steady

What it does

This repo is a set of Jupyter notebooks from NLP Town that walk through core NLP tasks: word embeddings, named entity recognition, text classification, sentence similarity, multilingual transfer, and classic sequence labeling. Each notebook is a self-contained tutorial pairing a technique with runnable code, usually in spaCy, PyTorch, scikit-learn, or Keras.

The interesting bit

The progression is deliberate. You start with LDA topic modeling and CRFs, then move through BiLSTMs and ELMo, and end at BERT and zero-shot classification. It is a time capsule of NLP’s last decade in notebook form, useful for understanding why we landed on transformers without pretending the older methods never existed.

Key highlights

  • 15 notebooks across six topic areas, from “NLP 101” to transfer learning
  • Covers both “traditional” scikit-learn pipelines and transformer fine-tuning
  • Includes a full multilingual track: cross-lingual similarity and transfer learning with BERT
  • Medical NER notebook applies pretrained transformers to a specialized domain
  • Zero-shot and intent classification notebooks address practical deployment scenarios
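
For readers unsure what a “traditional” scikit-learn pipeline looks like next to transformer fine-tuning, here is a minimal sketch. It is not taken from the notebooks: the toy sentences, the labels, and the `build_classifier` helper are all made up for illustration, and the notebooks may use different vectorizers or models.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy data invented for this sketch; it does not come from the repo.
train_texts = [
    "the film was wonderful and moving",
    "a wonderful, funny, brilliant movie",
    "brilliant acting and a moving story",
    "the film was terrible and boring",
    "a boring, dull, awful movie",
    "awful acting and a dull story",
]
train_labels = ["pos", "pos", "pos", "neg", "neg", "neg"]


def build_classifier():
    """Hypothetical helper: TF-IDF features feeding a logistic regression."""
    return make_pipeline(TfidfVectorizer(), LogisticRegression())


clf = build_classifier()
clf.fit(train_texts, train_labels)

# Classify two unseen sentences built from words seen during training.
predictions = clf.predict(["wonderful brilliant story", "dull and terrible"])
print(list(predictions))
```

The whole model is two steps chained into one estimator, which is the main appeal of this style: vectorizing and classifying are fitted and applied together, with no GPU or pretrained weights involved.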

Caveats

  • README is a bare list of links; no installation instructions, dependency versions, or tested environment specified
  • Some notebooks reference older stacks (ELMo, StanfordNLP) that may need pinning to run today
  • No continuous integration or execution badges visible, so bit-rot is a real risk
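
Because the repo pins nothing, one way to reduce the bit-rot risk is to record the environment in which a notebook last ran for you. The sketch below uses only the standard library. The package list is an assumption inferred from the libraries named in this review (plus `transformers`, which is a guess), not from the repo itself, and `installed_versions` and `format_pins` are hypothetical helper names.

```python
import sys
from importlib.metadata import PackageNotFoundError, version

# Assumed package list: inferred from this review, not read from the repo.
PACKAGES = ["spacy", "torch", "scikit-learn", "keras", "transformers"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def format_pins(versions):
    """Render requirements-style lines; missing packages become comments."""
    lines = []
    for name, ver in sorted(versions.items()):
        lines.append(f"{name}=={ver}" if ver else f"# {name}: not installed")
    return "\n".join(lines)


if __name__ == "__main__":
    print(f"# Python {sys.version.split()[0]}")
    print(format_pins(installed_versions(PACKAGES)))
```

Saving that output next to a notebook gives you a starting point for a pinned requirements file once you find a combination that runs.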

Verdict

Good for practitioners who need a quick, code-first refresher on a specific technique, or for teams onboarding junior NLP engineers. Skip it if you want a unified framework or production-ready pipelines; this is a reference shelf, not a product.
