1.2 billion tweets taught this model to read emotion
A PyTorch port of DeepMoji that turns text into 2304-dimensional emotional feature vectors, pretrained on emoji-labeled Twitter data.

What it does
TorchMoji is a PyTorch reimplementation of MIT’s DeepMoji model. It was trained on 1.2 billion tweets that contained emojis, learning to associate language patterns with emotional expression. You can use it to extract emoji predictions from text, generate dense emotional feature vectors, or fine-tune the model on your own sentiment or sarcasm detection tasks.
The interesting bit
The emoji-supervision angle is clever: instead of relying on expensive human emotion labels, the model treats the emojis people already attach to their tweets as an abundant, if noisy, proxy for affect. The resulting representations transfer well to other emotion-related NLP tasks. Hugging Face’s blog post on the Keras-to-PyTorch port is also a decent read if you’re into model archaeology.
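To make the idea concrete, here is a minimal sketch of how emoji-bearing text could be turned into training pairs. It is an illustration of the general technique, not the authors' preprocessing pipeline: the four-emoji label set and the `make_training_pair` helper are hypothetical, chosen only for the example.

```python
# Hypothetical label set for illustration; not the emoji set used by DeepMoji.
EMOJI_LABELS = ["😂", "😭", "😍", "😡"]


def make_training_pair(tweet, emoji_labels=EMOJI_LABELS):
    """Turn a raw tweet into (text, label_ids), or None if it has no known emoji.

    The emojis serve as noisy labels and are stripped from the input text,
    so a model has to predict them from the words alone.
    """
    # Collect the index of every label emoji that occurs in the tweet.
    labels = [idx for idx, emoji in enumerate(emoji_labels) if emoji in tweet]
    if not labels:
        return None  # no supervision signal in this tweet

    # Remove the label emojis from the text and normalize whitespace.
    text = tweet
    for emoji in emoji_labels:
        text = text.replace(emoji, "")
    text = " ".join(text.split())
    return text, labels


tweets = [
    "so done with today 😭😭",
    "great job 😂 love it 😍",
    "no emoji here",
]
pairs = [pair for pair in map(make_training_pair, tweets) if pair is not None]
for text, labels in pairs:
    print(text, labels)
```

The third tweet is dropped because it carries no label. At the scale of a billion tweets, that kind of filtering still leaves plenty of data, which is what makes the noisy labels usable.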
Key highlights
- Pretrained weights (~85MB) available via download script; vocabulary and model included in repo
- Outputs 2304-dimensional emotional feature vectors for arbitrary text
- Examples cover scoring texts for emoji relevance, encoding, and fine-tuning on new datasets
- Unit tests included, with optional slow tests that verify fine-tuning accuracy
- MIT licensed
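Scoring text for emoji relevance usually ends with a small post-processing step: picking the most likely emojis from the model's output. The sketch below assumes the model returns one probability per emoji class as a flat list. The `top_emoji_ids` helper and the five-element vector are hypothetical and do not reproduce the repo's example scripts.

```python
def top_emoji_ids(probabilities, k=5):
    """Return the indices of the k largest probabilities, highest first."""
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: probabilities[i],
        reverse=True,
    )
    return ranked[:k]


# Made-up probabilities standing in for a model's output on one sentence.
probs = [0.05, 0.40, 0.10, 0.30, 0.15]
print(top_emoji_ids(probs, k=3))  # [1, 3, 4]
```

Mapping the returned indices back to actual emoji characters requires the label ordering that ships with the model, which is not shown here.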
Caveats
- GPU use is inefficient: per the README, the model “can’t make efficient use of CUDA”
- The code is explicitly tested on Python 2.7 and 3.5 only; newer versions are unverified
- The authors state that the code is not optimized for efficiency and do not guarantee it is bug-free
- Last meaningful update appears to be September 2018
Verdict
Worth a look if you need off-the-shelf emotional text representations and can tolerate dated PyTorch. Skip it if you need modern Python support, GPU efficiency, or a maintained codebase: this is research code that has been quietly aging.