hundredblocks/concrete_NLP_tutorial

Is this 2017 NLP workshop still worth your time?

A hands-on tutorial that skips the theory sermon and jumps straight to word vectors and character-level RNNs on real Yelp data.

1.1k stars · Jupyter Notebook · Learning · Language Models
Velocity (7d): +0.3 ★ / day · Trend: steady

What it does

This is a Jupyter notebook workshop from ODSC 2017 built around two concrete tasks: loading pretrained Google News word2vec vectors through gensim, and running a character-level RNN trained on Yelp reviews. The README is sparse — clone, download two external data files, and follow along. No framework manifesto, no cloud deployment guide.
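For readers who have not seen character-level generation before, here is a minimal sketch of the sampling loop such a model typically sits inside. It is not taken from the notebook: `toy_model`, `VOCAB`, and both function names are hypothetical stand-ins, and a real char-RNN would return probabilities that depend on the text so far.

```python
import math
import random


def apply_temperature(probs, temperature=1.0):
    """Rescale a probability distribution; temperatures below 1 sharpen it."""
    logits = [math.log(p + 1e-12) / temperature for p in probs]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def generate(next_char_probs, vocab, seed_text, length, temperature=1.0, rng=None):
    """Extend seed_text by `length` characters, sampling one at a time."""
    rng = rng or random.Random()
    text = seed_text
    for _ in range(length):
        probs = apply_temperature(next_char_probs(text), temperature)
        text += rng.choices(vocab, weights=probs, k=1)[0]
    return text


# Hypothetical stand-in for a trained model. A real char-RNN would
# condition its output on `text`; this one ignores it.
VOCAB = list("abc ")


def toy_model(text):
    return [0.4, 0.3, 0.2, 0.1]


sample = generate(toy_model, VOCAB, "ab", 20, temperature=0.5, rng=random.Random(0))
print(sample)
```

Lowering `temperature` makes the output more repetitive and conservative; raising it makes the output more varied and more error-prone.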

The interesting bit

The “concrete” in the title is doing honest work. The repo assumes you already care about NLP and just want to see working code for embeddings and text generation without a 40-slide preamble on the history of neural networks.

Key highlights

  • Uses gensim’s built-in downloader for word2vec-google-news-300 (no manual wrestling with Google’s CDN)
  • Includes a pretrained char-RNN for Yelp, dated September 2017, saved as an HDF5 weight file
  • Notebook format means you can execute cell-by-cell or run end-to-end
  • 1,078 stars suggest it found an audience that appreciated the directness
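The gensim downloader route mentioned in the first highlight can be sketched roughly as follows. This is an illustration rather than the notebook's own cell: `load_vectors` and `cosine` are hypothetical helper names, and the import is deferred because the first call downloads well over a gigabyte of data.

```python
import math

# Model identifier as quoted in the highlights above.
MODEL_NAME = "word2vec-google-news-300"


def load_vectors(name=MODEL_NAME):
    """Fetch (on first call) and load pretrained vectors through gensim."""
    import gensim.downloader as api  # lazy import: requires gensim and a large download
    return api.load(name)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (pure Python)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm
```

With gensim installed, `vectors = load_vectors()` returns a `KeyedVectors` object, so `vectors.most_similar("pizza", topn=5)` or `cosine(vectors["pizza"], vectors["pasta"])` should work for a quick live demo.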

Caveats

  • The pretrained Yelp model lives on an S3 bucket with a hardcoded URL; if that bucket goes away, the workshop breaks
  • No requirements.txt or environment specification in the README — you’ll need to infer dependencies from the notebook itself
  • Word2vec and char-RNN were state-of-the-art-adjacent in 2017; modern practitioners will want transformers
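Since there is no requirements.txt, one way to infer dependencies is to scan the notebook's code cells for import statements. The sketch below assumes the nbformat 4 JSON layout (a top-level `cells` list). The function names are hypothetical. Cells that fail to parse, for example Python 2 syntax in an older notebook, are skipped rather than guessed at.

```python
import ast
import json


def imported_modules(notebook):
    """Return the sorted top-level module names imported in code cells."""
    modules = set()
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # nbformat stores source as a list of lines
            source = "".join(source)
        # Drop IPython magics and shell escapes, which are not valid Python.
        lines = [
            line for line in source.splitlines()
            if not line.lstrip().startswith(("%", "!"))
        ]
        try:
            tree = ast.parse("\n".join(lines))
        except SyntaxError:
            continue  # e.g. Python 2 print statements; skip the whole cell
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                modules.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                modules.add(node.module.split(".")[0])
    return sorted(modules)


def load_notebook(path):
    """Read an .ipynb file from disk as a plain dictionary."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

The result lists import names, which do not always match package names on PyPI (`sklearn` is installed as `scikit-learn`, for instance), and it says nothing about which versions the notebook was written against.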

Verdict

Good for someone who learns by reverse-engineering working code, or who needs to explain embeddings to a skeptical team with a live demo. Skip it if you want current architectures, dependency hygiene, or a maintained project.
