← all repositories
turtlesoupy/this-word-does-not-exist

GPT-2 writes the dictionary, badly

A finetuned language model that invents convincing fake words, then defines them with straight-faced authority.

1k stars Python Language Models
this-word-does-not-exist
Velocity · 7d
+0.4
★ / day
Trend
steady
star history

What it does

This project trains a GPT-2 variant to generate dictionary entries from thin air: a made-up word, its part of speech, a plausible definition, and even a usage example. It can also run in reverse, conjuring a neologism to match a definition you provide. The author runs a live demo site and a Twitter bot that posts the results.

The interesting bit

The trick isn’t just generating gibberish—it’s the bidirectional setup. There’s a “forward” model (word → definition) and an “inverse” model (definition → word), plus a blacklist to filter out real words that slip through. The training data comes from scraping Apple’s built-in dictionaries or Urban Dictionary, which gives the model a surprisingly formal register to parody.

Key highlights

  • Pre-trained models and a WordGenerator class for one-line inference
  • Supports CPU inference with optional quantization
  • Training pipeline included, with scrapers for Apple and Urban Dictionary sources
  • Live demo at thisworddoesnotexist.com and a Twitter bot (@robo_define)
  • Sample notebooks and a full training script in the repo

Caveats

  • The README has a typo in the very first sentence (“This is a project allows people”), which feels appropriate
  • No benchmark or evaluation metrics provided; quality is “try it and see”
  • Model files are hosted on Google Cloud Storage with no versioning beyond “v1”

Verdict

Good for NLP tinkerers who want a concrete, amusing finetuning project, or anyone building creative text generators. Skip it if you need rigorous evaluation or production-grade text generation.

heatdrop uses Google Analytics to see which pages get read — nothing else. Your call. How we handle data.