oreilly-japan/deep-learning-from-scratch-2

A Japanese NLP course where you build RNNs by hand

Companion code for a book that teaches deep learning for natural language processing from scratch, with minimal dependencies.

1.2k stars · Python · Learning · Language Models
Velocity (7d): +0.4 ★ / day · Trend: steady

What it does

This repository holds the chapter-by-chapter source code for ゼロから作る Deep Learning ❷ (“Deep Learning from Scratch 2: Natural Language Processing”), an O’Reilly Japan book published in 2018. Each folder maps to a book chapter, plus shared utilities in common/ and datasets in dataset/. You get word embeddings, RNNs, LSTMs, and attention mechanisms implemented in plain NumPy, with optional GPU acceleration via CuPy.
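The optional CuPy acceleration works because CuPy mirrors much of the NumPy API, so array code can be written once against an alias. The following is a minimal sketch of that pattern, not the repository's actual switch: the `USE_GPU` flag, the `xp` alias, and the `softmax` helper are hypothetical names chosen for illustration.

```python
import numpy

# Hypothetical flag for this sketch; set to True only if CuPy and a CUDA GPU are available.
USE_GPU = False

if USE_GPU:
    import cupy as xp  # GPU-backed arrays with a NumPy-like API
else:
    xp = numpy         # CPU fallback


def softmax(scores):
    """Row-wise softmax written against the backend alias `xp`."""
    # Subtract the row maximum first for numerical stability.
    shifted = scores - scores.max(axis=-1, keepdims=True)
    exp = xp.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


probs = softmax(xp.array([[1.0, 2.0, 3.0]]))
```

With the flag left at `False`, this runs on plain NumPy; the same function body would run on the GPU if the alias pointed at CuPy.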

The interesting bit

The book’s premise is pedagogical masochism done right: you write the forward and backward passes yourself rather than calling torch.nn.LSTM. The code stays readable because it avoids framework magic — you can actually see where the gradient flows. Pre-trained weights for chapters 6 and 7 are hosted separately, so you can skip the long training runs and still experiment.
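To give a feel for what writing the forward and backward passes yourself involves, here is a minimal sketch of a single vanilla RNN time step in NumPy. It is written for this summary and is not copied from the repository; the class name `TanhRNNStep` and the toy dimensions are assumptions, and the loss is simply the sum of the outputs so the gradient can be checked numerically.

```python
import numpy as np


class TanhRNNStep:
    """One RNN time step: h_next = tanh(h_prev @ Wh + x @ Wx + b). Hypothetical name."""

    def __init__(self, Wx, Wh, b):
        self.params = [Wx, Wh, b]
        self.grads = [np.zeros_like(Wx), np.zeros_like(Wh), np.zeros_like(b)]
        self.cache = None

    def forward(self, x, h_prev):
        Wx, Wh, b = self.params
        h_next = np.tanh(h_prev @ Wh + x @ Wx + b)
        self.cache = (x, h_prev, h_next)  # saved for the backward pass
        return h_next

    def backward(self, dh_next):
        Wx, Wh, b = self.params
        x, h_prev, h_next = self.cache
        dt = dh_next * (1.0 - h_next ** 2)   # derivative of tanh
        self.grads[0][...] = x.T @ dt        # dL/dWx
        self.grads[1][...] = h_prev.T @ dt   # dL/dWh
        self.grads[2][...] = dt.sum(axis=0)  # dL/db
        dx = dt @ Wx.T
        dh_prev = dt @ Wh.T
        return dx, dh_prev


# Toy sizes (assumed): batch of 2, input size 3, hidden size 4.
rng = np.random.default_rng(0)
N, D, H = 2, 3, 4
Wx = rng.standard_normal((D, H))
Wh = rng.standard_normal((H, H))
b = np.zeros(H)
x = rng.standard_normal((N, D))
h_prev = rng.standard_normal((N, H))

layer = TanhRNNStep(Wx, Wh, b)
h = layer.forward(x, h_prev)
dx, dh_prev = layer.backward(np.ones_like(h))  # loss = h.sum()

# Central-difference check of dL/dWx[0, 0] against the analytic gradient.
eps = 1e-5
Wx[0, 0] += eps
loss_plus = layer.forward(x, h_prev).sum()
Wx[0, 0] -= 2 * eps
loss_minus = layer.forward(x, h_prev).sum()
Wx[0, 0] += eps  # restore the original weight
numeric_grad = (loss_plus - loss_minus) / (2 * eps)
analytic_grad = layer.grads[0][0, 0]
```

The numerical check at the end is the kind of sanity test that becomes possible when every gradient is written out by hand: if `numeric_grad` and `analytic_grad` disagree, the backward pass has a bug.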

Key highlights

  • Eight chapters of incremental NLP implementations, from word2vec to attention-based models
  • Pure NumPy + Matplotlib core; SciPy and CuPy are strictly optional
  • Pre-trained weight file (BetterRnnlm.pkl) available for chapters 6–7
  • MIT licensed, explicitly cleared for commercial use
  • Active errata page maintained by the publisher

Caveats

  • All documentation and comments are in Japanese; English speakers will need a translation tool or some patience
  • Published in 2018, so newer architectures and practices (Transformers, modern tokenizers) are not covered
  • The README notes that code explanations live in the book, not the repo — this is strictly companion material

Verdict

Grab this if you read Japanese (or are learning it) and want to understand RNN internals by typing them out, or if you teach and need clean, dependency-light reference implementations. Skip it if you want production-grade frameworks or modern Transformer-based NLP.
