Is cnn-text-classification-pytorch open source?

Yes — Shawn1993/cnn-text-classification-pytorch is open source, released under the Apache-2.0 license.

What language is cnn-text-classification-pytorch written in?

Shawn1993/cnn-text-classification-pytorch is primarily written in Python.

How popular is cnn-text-classification-pytorch?

Shawn1993/cnn-text-classification-pytorch has 1k stars on GitHub.

Where can I find cnn-text-classification-pytorch?

Shawn1993/cnn-text-classification-pytorch is on GitHub at https://github.com/Shawn1993/cnn-text-classification-pytorch.

Shawn1993/cnn-text-classification-pytorch

A 2014 paper, resurrected for PyTorch 2.0

This repo modernizes Yoon Kim's classic CNN sentence classifier without letting it drift from the original results.

★ 1k stars · Python · Language Models · ML Frameworks

What it does

Implements Kim’s 2014 “Convolutional Neural Networks for Sentence Classification” in modern PyTorch. It trains 1-D convolutions over word embeddings for binary (MR) or fine-grained 5-class (SST) sentiment classification, with a CLI for training, testing, and one-off prediction.
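
The architecture described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the class name KimCNN and the embedding size are assumptions made here, while the kernel sizes (3, 4, 5), 100 filters per size, and dropout of 0.5 follow the settings reported in Kim's paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class KimCNN(nn.Module):
    """Hypothetical sketch: parallel 1-D convolutions over word embeddings."""

    def __init__(self, vocab_size, embed_dim=128, num_classes=2,
                 kernel_sizes=(3, 4, 5), num_filters=100, dropout=0.5):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # One convolution per kernel size, each sliding along the token axis.
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, num_filters, k) for k in kernel_sizes]
        )
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) -> (batch, seq_len, embed_dim)
        x = self.embed(token_ids)
        # Conv1d expects channels first: (batch, embed_dim, seq_len)
        x = x.transpose(1, 2)
        # Each conv gives (batch, num_filters, seq_len - k + 1);
        # max-over-time pooling reduces that to (batch, num_filters).
        pooled = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        features = self.dropout(torch.cat(pooled, dim=1))
        return self.fc(features)


# 5 classes as in fine-grained SST; a batch of 8 sentences, 20 tokens each.
model = KimCNN(vocab_size=1000, num_classes=5)
logits = model(torch.randint(0, 1000, (8, 20)))
print(logits.shape)  # torch.Size([8, 5])
```

Because each convolution needs at least as many tokens as its kernel width, a sentence shorter than the largest kernel would raise an error in this sketch.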

The interesting bit

The value is in the archaeology. The author didn’t just port the model: they stripped out deprecated torchtext APIs, fixed half a dozen RuntimeErrors and IndexErrors that surface when old PyTorch code meets 2.0+, and added the previously missing L2-norm constraint on the fully connected layer that Kim’s original used. Reported accuracy closely tracks the paper: 76.5% on MR and 45.6% on SST.
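
Kim's paper constrains the L2 norm of the weight vectors by rescaling them whenever the norm exceeds a threshold s after a gradient step, with s = 3. One way to express that for a linear layer is sketched below. The helper name apply_max_norm is hypothetical, and applying the constraint per output row is an assumption; the repository may implement it differently.

```python
import torch
import torch.nn as nn


def apply_max_norm(layer: nn.Linear, max_norm: float = 3.0) -> None:
    """Rescale each output unit's weight vector so its L2 norm is <= max_norm."""
    with torch.no_grad():
        # One norm per output row: shape (out_features, 1)
        norms = layer.weight.norm(p=2, dim=1, keepdim=True)
        # Shrink rows that exceed the limit; leave the others unchanged.
        scale = (max_norm / norms.clamp(min=1e-12)).clamp(max=1.0)
        layer.weight.mul_(scale)


fc = nn.Linear(300, 2)
with torch.no_grad():
    fc.weight.mul_(100.0)  # inflate the weights so the constraint is active

apply_max_norm(fc)  # in training, call this after each optimizer.step()
print(fc.weight.norm(dim=1))  # every row norm is now at most 3.0
```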

Key highlights

Reproduces Kim’s CNN-rand baseline to within about 0.5 percentage points on both the MR and SST datasets
Drops the torchtext dependency in favor of the standard Dataset/DataLoader with a custom BucketSampler
Supports both Adam and Adadelta optimizers, plus phrase-level SST training data
CLI includes train/test/predict modes with snapshot loading
Requires only PyTorch ≥2.0 and Python ≥3.8
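
The bucket sampling mentioned in the highlights groups examples of similar length into the same batch, which keeps padding small. The sketch below shows the general technique in plain Python. It is not the repository's BucketSampler: the class name BucketBatchSampler and the bucket_mult parameter are assumptions made for illustration.

```python
import random


class BucketBatchSampler:
    """Hypothetical sketch: yield index batches whose examples have similar lengths."""

    def __init__(self, lengths, batch_size, bucket_mult=50, shuffle=True, seed=0):
        self.lengths = list(lengths)
        self.batch_size = batch_size
        # A bucket holds several batches' worth of examples to sort together.
        self.bucket_size = batch_size * bucket_mult
        self.shuffle = shuffle
        self.rng = random.Random(seed)

    def __iter__(self):
        indices = list(range(len(self.lengths)))
        if self.shuffle:
            self.rng.shuffle(indices)
        batches = []
        for start in range(0, len(indices), self.bucket_size):
            # Sort only within the bucket, so batches stay varied across epochs.
            bucket = sorted(indices[start:start + self.bucket_size],
                            key=lambda i: self.lengths[i])
            for j in range(0, len(bucket), self.batch_size):
                batches.append(bucket[j:j + self.batch_size])
        if self.shuffle:
            self.rng.shuffle(batches)
        return iter(batches)

    def __len__(self):
        # Number of batches; bucket_size is a multiple of batch_size.
        return (len(self.lengths) + self.batch_size - 1) // self.batch_size


lengths = [5, 32, 7, 30, 6, 31, 8, 29]  # token counts per example
sampler = BucketBatchSampler(lengths, batch_size=2, bucket_mult=4, shuffle=False)
batches = list(sampler)
print(batches)  # [[0, 4], [2, 6], [7, 3], [5, 1]]
```

An object like this can be passed to torch.utils.data.DataLoader through its batch_sampler argument, together with a collate function that pads each batch to its longest example.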

Caveats

Input text must be space-separated, with punctuation split off as separate tokens, and longer than the largest kernel size; shorter inputs cause an error
Only two datasets (MR, SST) are bundled; you’ll need to wire your own data loader for anything else
The 2017 snapshot path in the README predict examples is stale; you’ll need to point to your own trained model
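
The input-format caveat can be handled with a small preprocessing step before prediction. The sketch below is one possible workaround rather than anything the repository provides: the function names and the `<pad>` token are hypothetical, and padding only helps if the model's vocabulary actually contains that token.

```python
import re


def tokenize(text):
    """Put spaces around punctuation (apostrophes excepted), then split on whitespace."""
    spaced = re.sub(r"([^\w\s'])", r" \1 ", text)
    return spaced.split()


def pad_to_min_length(tokens, min_len, pad_token="<pad>"):
    """Append pad tokens until the sequence has at least min_len tokens."""
    return tokens + [pad_token] * max(0, min_len - len(tokens))


tokens = tokenize("Great film, truly!")
print(tokens)  # ['Great', 'film', ',', 'truly', '!']

# min_len=6 assumes a largest kernel size of 5, plus one token of margin.
padded = pad_to_min_length(tokenize("Good."), min_len=6)
print(" ".join(padded))  # Good . <pad> <pad> <pad> <pad>
```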

Verdict

Useful if you need a clean, reproducible baseline for sentence classification or you’re teaching the Kim paper and want code that actually runs on modern PyTorch. Skip it if you need multilingual support, transformer baselines, or a production-ready pipeline.
