minosvasilias/godot-dodo

Teaching LLaMA to speak Godot, not Python with a hat

A finetuning pipeline that trains on real, human-written GDScript and uses GPT-3.5 only for labeling, producing models that reportedly beat GPT-4 on syntax accuracy for this niche game-engine language.

godot-dodo · Velocity (7d): +0.5 ★/day · Trend: steady

What it does

godot-dodo scrapes MIT-licensed Godot repositories from GitHub, splits .gd files into functions, and uses gpt-3.5-turbo to auto-generate a descriptive comment for each one. The resulting comment:code pairs are used to finetune LLaMA variants (7B and 13B) specifically for GDScript generation. The repo includes the full dataset assembly scripts, pre-built 60k-row datasets, finetuning configs, and published model weights.
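To make the "splits .gd files into functions" step concrete, here is a minimal Python sketch. It is not the repo's actual script: the regex, the `split_functions` name, and the rule that any new top-level line ends the current function are all assumptions chosen for illustration, and the sketch ignores inner classes and column-0 comments inside function bodies.

```python
import re

# Assumption: a function starts at column 0 with `func` or `static func`.
FUNC_START = re.compile(r"^(?:static\s+)?func\s+(\w+)\s*\(")


def split_functions(source: str) -> list[tuple[str, str]]:
    """Return (name, code) pairs for each top-level function in GDScript source."""
    functions = []
    name, body = None, []
    for line in source.splitlines():
        top_level = bool(line) and not line[0].isspace()
        if top_level and name is not None:
            # Any new top-level line closes the function being collected.
            functions.append((name, "\n".join(body).rstrip()))
            name, body = None, []
        match = FUNC_START.match(line)
        if match:
            name, body = match.group(1), [line]
        elif name is not None:
            body.append(line)
    if name is not None:
        functions.append((name, "\n".join(body).rstrip()))
    return functions


# Hypothetical sample file, not taken from the dataset.
SAMPLE = """extends Node2D

var speed = 200

func _ready():
\tprint("ready")

func move(delta):
\tposition.x += speed * delta
\treturn position
"""

for func_name, code in split_functions(SAMPLE):
    print(func_name, "->", len(code.splitlines()), "lines")
```

Note that `move` in the sample reads `speed`, which is declared outside the function. Splitting at function granularity drops that declaration from the training example.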

The interesting bit

The twist is what the LLM isn’t used for. Unlike Alpaca-style approaches that synthesize training outputs from larger models, this project uses only human-written code for the code side of each pair, with GPT-3.5 acting solely as a labeler. The README claims the resulting models achieve “significantly greater consistency” than GPT-4 and GPT-3.5-turbo on GDScript syntax, with code-specific base models even outperforming them on complex instructions.
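The division of labor can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's code: the `instruction` and `output` field names, the `build_pairs` helper, and the `placeholder_label` function (a local stand-in for the gpt-3.5-turbo call) are all hypothetical. The point is only that the model-generated text lands on the instruction side, while the output side is the human-written function, copied verbatim.

```python
import json
from typing import Callable


def build_pairs(functions: list[str], label: Callable[[str], str]) -> list[dict]:
    """Pair each human-written function with a generated description."""
    rows = []
    for code in functions:
        rows.append({
            "instruction": label(code),  # generated text (labeler side)
            "output": code,              # human-written code, left untouched
        })
    return rows


def placeholder_label(code: str) -> str:
    """Hypothetical stand-in for the LLM labeler; only echoes the signature."""
    signature = code.splitlines()[0].strip()
    return f"Describe and implement: {signature}"


# Hypothetical function body, not taken from the dataset.
human_code = "func jump():\n\tvelocity.y = -400"
rows = build_pairs([human_code], placeholder_label)
print(json.dumps(rows, indent=2))
```

In a real pipeline, `placeholder_label` would be swapped for a call to the labeling model; nothing else in the row changes.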

Key highlights

  • Full pipeline from GitHub scraping to published HuggingFace weights
  • Pre-assembled 60k-row dataset for Godot 4.x projects; 3.x data noted as future work
  • Training cost: $30 for dataset labeling, $24–$84 for finetuning runs on 8x A100 80GB GPUs
  • Explicitly MIT-only code sourcing, with all scraped projects listed for attribution
  • Colab inference notebook included for trying models without local GPU

Caveats

  • The README identifies a real weakness: models learn to reference objects initialized outside function scope, producing code that assumes unimplemented context
  • Hardware requirements are steep: the 13B model needed eight A100s, and NVLink is recommended over PCIe
  • Godot 3.x dataset and model variants are missing despite the methodology being language-agnostic

Verdict

Worth a look if you’re building Godot tooling or exploring domain-specific LLM finetuning on a budget. Skip if you need general-purpose code generation or lack access to serious GPU hardware.
