graykode/gpt-2-Pytorch

GPT-2 in PyTorch: the 2019 glue-code special

A stripped-down PyTorch wrapper around Hugging Face's converted GPT-2 weights, written back when OpenAI still hadn't released the full model.

1k stars · Python · Language Models · ML Frameworks
Velocity (7d): +0.4 ★/day · Trend: steady

What it does Loads a pre-converted PyTorch GPT-2 checkpoint from Hugging Face (which you fetch by hand first) and runs text generation via a compact main.py. You feed it a seed string (Orwell’s 1984 is the documented example) and it continues the text with configurable temperature, top-k sampling, and length. There’s also a Google Colab notebook for trying it without installing anything.

The interesting bit The README is admirably honest about what this actually is: “compress code” that leans entirely on Hugging Face’s TensorFlow-to-PyTorch conversion. The author even thanks them for “help my problem transferring tensorflow(ckpt) file to Pytorch Model.” In 2019, that conversion was genuinely useful plumbing; today it’s a historical curiosity.

Key highlights

  • Single-file-ish wrapper around Hugging Face’s gpt2-pytorch_model.bin
  • Supports conditional and unconditional generation (--unconditional)
  • Tunable sampling: temperature (default 0.7), top-k (default 40), batch size, sequence length
  • macOS setup instructions included (needs libomp, venv dance)
  • PyTorch 0.4.1+ dependency dates this firmly to the pre-1.0 era
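For readers unfamiliar with the two sampling knobs listed above, here is a minimal sketch of how temperature and top-k typically combine when picking the next token. It is not the repository's code: the function name `sample_top_k` and the toy logits are hypothetical, and it uses plain Python instead of PyTorch so it runs anywhere. Only the defaults (0.7 and 40) come from the highlights above.

```python
import math
import random


def sample_top_k(logits, k=40, temperature=0.7, rng=random):
    """Pick one token index from raw logits using temperature and top-k.

    Illustrative helper (hypothetical name), not taken from the repository.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Clamp k so it never exceeds the vocabulary size.
    k = max(1, min(k, len(logits)))
    # Keep only the indices of the k highest-scoring tokens.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Temperature scaling: values below 1 sharpen the distribution.
    scaled = [logits[i] / temperature for i in top]
    # Numerically stable softmax weights over the surviving candidates.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    # Draw one index in proportion to its weight.
    return rng.choices(top, weights=weights, k=1)[0]


# Toy example: five "tokens", keep the best two, sample five times.
logits = [2.0, 1.0, 0.5, -1.0, -3.0]
rng = random.Random(0)
picks = [sample_top_k(logits, k=2, rng=rng) for _ in range(5)]
print(picks)  # every pick is index 0 or 1, the two highest logits
```

Lower temperatures push the choice toward the single most likely token, while a larger k lets rarer tokens back into the draw; with k=1 the function reduces to greedy decoding.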

Caveats

  • Requires manually curl-ing a ~500MB model file; no automated download
  • PyTorch 0.4.1 is ancient; expect friction on modern environments
  • The “compress code” claim is relative—it’s simple, not novel architecture work

Verdict Worth a look if you’re doing archaeology on early GPT-2 ecosystem tooling or need a minimal, no-abstraction generation script. Skip it if you want training, fine-tuning, or anything post-2020; Hugging Face’s transformers library has made this entirely redundant.
