GPT-2 writes the dictionary, badly
A finetuned language model that invents convincing fake words, then defines them with straight-faced authority.

What it does
This project trains a GPT-2 variant to generate dictionary entries from thin air: a made-up word, its part of speech, a plausible definition, and even a usage example. It can also run in reverse, conjuring a neologism to match a definition you provide. The author runs a live demo site and a Twitter bot that posts the results.
The interesting bit
The trick isn’t just generating gibberish; it’s the bidirectional setup. There’s a “forward” model (word → definition) and an “inverse” model (definition → word), plus a blacklist to filter out real words that slip through. The training data comes from scraping either Apple’s built-in dictionaries or Urban Dictionary; the Apple source in particular gives the model a surprisingly formal register to parody.
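To make the blacklist idea concrete, here is a minimal sketch of how generated entries could be screened against a set of known real words. It is not the project's code: the `Blacklist` class, the `normalize` and `filter_novel` helpers, and the dict layout of each candidate are all hypothetical names chosen for illustration, and the repo's own filtering may work differently.

```python
import re


def normalize(word: str) -> str:
    """Lowercase the word and drop anything other than letters, apostrophes, and hyphens."""
    return re.sub(r"[^a-z'-]", "", word.strip().lower())


class Blacklist:
    """Hypothetical helper: a set of real words that generated entries must not reuse."""

    def __init__(self, real_words):
        self._words = {normalize(w) for w in real_words}

    def contains(self, word: str) -> bool:
        return normalize(word) in self._words


def filter_novel(candidates, blacklist):
    """Keep only candidates whose headword is not a known real word."""
    return [c for c in candidates if not blacklist.contains(c["word"])]


# Illustrative data only; these are not outputs from the actual model.
blacklist = Blacklist(["table", "chair"])
candidates = [
    {"word": "Glimmerate", "definition": "to shine in a hesitant way"},
    {"word": "Table", "definition": "a piece of furniture"},
]
novel = filter_novel(candidates, blacklist)
```

Normalizing both sides before the lookup means a capitalized or padded real word such as "Table" is still caught, while the invented "Glimmerate" passes through.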
Key highlights
- Pre-trained models and a WordGenerator class for one-line inference
- Supports CPU inference with optional quantization
- Training pipeline included, with scrapers for Apple and Urban Dictionary sources
- Live demo at thisworddoesnotexist.com and a Twitter bot (@robo_define)
- Sample notebooks and a full training script in the repo
Caveats
- The README has a typo in the very first sentence (“This is a project allows people”), which feels appropriate
- No benchmark or evaluation metrics provided; quality is “try it and see”
- Model files are hosted on Google Cloud Storage with no versioning beyond “v1”
Verdict
Good for NLP tinkerers who want a concrete, amusing finetuning project, or anyone building creative text generators. Skip it if you need rigorous evaluation or production-grade text generation.