CodedotAl/gpt-code-clippy

Open-source Copilot clone admits: most of our models score zero

A community effort to replicate GitHub Copilot that publishes its training recipes, its failures, and its honest confusion about which model to use.

Velocity (7d): +1.8 ★/day · Trend: steady

What it does

GPT-Code-Clippy fine-tunes GPT-2 and GPT-Neo on scraped GitHub code to generate code completions. It ships a VS Code extension, a HuggingFace demo, and a 159 GB deduplicated training dataset built from SEART GitHub Search plus The Pile. The project is explicitly framed as an open-source answer to GitHub Copilot.

The interesting bit

The README’s candor is the feature. The authors publish HumanEval results in which several of their fine-tuned models score 0.00% at every reported cutoff, from pass@1 through pass@10. They note that “None improve on the standard GPT-Neo 125M model except for APPs specific models,” and they leave TODOs asking which model is recommended and how to train properly. This is less a product than a public lab notebook.
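To put a 0.00% row in context, here is a minimal sketch of the unbiased pass@k estimator introduced with HumanEval: for a problem with n generated samples of which c pass the unit tests, pass@k = 1 − C(n−c, k) / C(n, k), averaged over problems. The function names are illustrative, and whether the project's evaluation ran this exact computation is an assumption.

```python
from __future__ import annotations

from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one problem: n samples drawn, c of them correct."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) / comb(n, k) is the chance that a random size-k subset
    # of the n samples contains no correct sample. math.comb returns 0 when
    # k > n - c, so the estimate is exactly 1.0 in that case.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems; each entry is (n_samples, n_correct)."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

Under this estimator a model only reports 0.00% at every k if no sample passed on any problem, because c = 0 gives 0.0 regardless of k.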

Key highlights

  • Dataset deduplicated with a regex-based pass that compares files by their alphanumeric “variables”; the source code and a datasheet are available
  • Training hyperparameters fully documented: AdamW with GPT-3-style cosine decay for CodeClippy, and Adafactor for the 1.3B APPS fine-tune, a choice “in part determined by hardware limitations”
  • A VS Code extension exists, but it relies on the HuggingFace Inference API
  • Multiple model variants on HuggingFace Hub, including 125M and 1.3B parameter sizes
  • An open issue tracks a data bug in which wrong filenames may have corrupted language filtering
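As a rough illustration of the deduplication highlight, the sketch below treats two files as duplicates when they contain the same sequence of alphanumeric tokens, ignoring whitespace and punctuation. The token pattern, the SHA-256 fingerprint, and the function names are assumptions made for this example; the project's actual script may differ in all of them.

```python
from __future__ import annotations

import hashlib
import re

# Assumed definition of a "variable": any run of ASCII letters and digits.
TOKEN_RE = re.compile(r"[A-Za-z0-9]+")


def fingerprint(source: str) -> str:
    """Hash the ordered alphanumeric tokens of a file, ignoring everything else."""
    tokens = TOKEN_RE.findall(source)
    return hashlib.sha256(" ".join(tokens).encode("utf-8")).hexdigest()


def deduplicate(files: dict[str, str]) -> dict[str, str]:
    """Keep the first file seen for each fingerprint (hypothetical helper)."""
    seen: set[str] = set()
    kept: dict[str, str] = {}
    for path, source in files.items():
        key = fingerprint(source)
        if key not in seen:
            seen.add(key)
            kept[path] = source
    return kept
```

A scheme like this catches copies that differ only in formatting, but not near-duplicates with renamed identifiers.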

Caveats

  • HumanEval results show base GPT-Neo outperforming all CodeClippy variants; several models score exactly 0.00%
  • A known dataset bug means file extensions used for language filtering may be wrong, with unknown impact on training data quality
  • README contains multiple TODOs and no clear guidance on which model or training path to follow
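To see why the filename bug matters, consider a hypothetical extension-based language filter like the one sketched below. The extension table, the `file_name` field, and both function names are assumptions for illustration, not the project's actual pipeline.

```python
from __future__ import annotations

from pathlib import PurePosixPath

# Hypothetical mapping; a real pipeline would cover many more extensions.
EXTENSION_TO_LANGUAGE = {
    ".py": "Python",
    ".js": "JavaScript",
    ".java": "Java",
    ".go": "Go",
}


def language_of(file_name: str) -> str | None:
    """Infer the language from the file extension alone; None if unknown."""
    suffix = PurePosixPath(file_name).suffix.lower()
    return EXTENSION_TO_LANGUAGE.get(suffix)


def filter_by_language(records: list[dict], wanted: str) -> list[dict]:
    """Keep records whose recorded file name maps to the wanted language."""
    return [r for r in records if language_of(r["file_name"]) == wanted]
```

Because the filter never inspects file contents, a record stored under the wrong name is silently dropped from its real language and counted under another, which is why the impact on the training data is hard to estimate after the fact.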

Verdict

Worth following if you’re researching open-source code generation or want to see how a community project documents its stumbles in real time. Skip if you need a working Copilot replacement today.
