Microsoft's Reddit-trained chatbot, and the successor that replaced it
DialoGPT was an early GPT-2-based dialogue model, but Microsoft now tells you to use GODEL instead.

What it does
DialoGPT generates conversational responses by fine-tuning GPT-2 on 147 million multi-turn Reddit dialogues. The repo bundles data extraction scripts, training code, and three pretrained checkpoints (117M, 345M, 762M parameters) that you can download and fine-tune. A demo.py script attempts to paper over the setup pain by automating model downloads, data prep, and training in one command.
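To make "multi-turn dialogues" concrete, here is a minimal sketch of how a conversation can be flattened into a single language-modeling sequence. It assumes turns are joined with GPT-2's end-of-text marker, which is how the Hugging Face model cards show prompting; the function names are hypothetical, and the repo's own data scripts may store examples in a different on-disk format.

```python
# Assumption: GPT-2's end-of-text marker, used here as the turn separator.
EOS = "<|endoftext|>"


def flatten_dialogue(turns):
    """Join dialogue turns into one string, ending every turn with EOS."""
    # Drop empty turns and surrounding whitespace before joining.
    cleaned = [t.strip() for t in turns if t.strip()]
    return "".join(turn + EOS for turn in cleaned)


def split_context_response(turns):
    """Treat the last turn as the target response and the rest as context."""
    if len(turns) < 2:
        raise ValueError("need at least one context turn and one response")
    return flatten_dialogue(turns[:-1]), turns[-1].strip() + EOS


# Hypothetical three-turn exchange, for illustration only.
example = [
    "Does money buy happiness?",
    "Depends how much you spend on it.",
    "Fair point.",
]
flat = flatten_dialogue(example)
context, response = split_context_response(example)
```

The context/response split mirrors the usual fine-tuning setup, where the model conditions on the earlier turns and learns to produce the last one.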
The interesting bit
The README opens with a blunt admission: this project is “no longer maintained” and superseded by GODEL, which “outperforms DialoGPT.” It’s rare to see a research repo actively discourage its own use; the main reason left to pick it up is reproducibility. The model also reportedly produced responses comparable in quality to human ones under a single-turn Turing test, though the sources don’t detail how rigorous that test was.
Key highlights
- Three model sizes, with the 762M variant needing >16GB GPU memory for efficient training
- Distributed training supported; 8 V100s cut epoch time from 118h to 18h on benchmark data
- Docker and Conda environments provided, but Ubuntu 16.04 is the only officially supported OS
- Hugging Face model cards available for easier interactive use
- Includes DSTC-7 challenge reproduction scripts and a 6k multi-reference test set
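The distributed-training numbers in the list above are easier to judge as a speedup and a scaling efficiency. This is a back-of-the-envelope sketch, assuming the 118h figure is a single-GPU baseline (the summary does not say so explicitly); the function name is made up for illustration.

```python
def scaling_summary(baseline_hours, multi_gpu_hours, n_gpus):
    """Return (speedup, efficiency, gpu_hours) for a multi-GPU run."""
    speedup = baseline_hours / multi_gpu_hours   # how many times faster
    efficiency = speedup / n_gpus                # 1.0 would be perfect linear scaling
    gpu_hours = multi_gpu_hours * n_gpus         # total compute consumed per epoch
    return speedup, efficiency, gpu_hours


# Figures quoted above: 118h down to 18h on 8 V100s.
speedup, efficiency, gpu_hours = scaling_summary(118, 18, 8)
print(f"speedup: {speedup:.2f}x, efficiency: {efficiency:.0%}, GPU-hours: {gpu_hours}")
# prints "speedup: 6.56x, efficiency: 82%, GPU-hours: 144"
```

Under that assumption, eight GPUs buy roughly a 6.6x faster epoch at the cost of about 22% more total GPU-hours (144 versus 118).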
Caveats
- Data pipeline broke in 2022 due to Pushshift server changes; fix requires 800GB temp disk space and ~10 hours with 8 processes
- “Stability can not be gauranteed” on non-Linux platforms (their spelling, not mine)
- FP16 training requires installing a specific pinned commit of NVIDIA Apex
Verdict
Worth a look if you’re reproducing 2019 dialogue generation papers or need a GPT-2-based baseline. Everyone else should follow Microsoft’s advice and head to GODEL. The repo’s real value may be as a time capsule of early large-scale dialogue pretraining — and as a case study in graceful project obsolescence.