thunlp/OpenDelta

Fine-tune giants by thawing only the frosting

OpenDelta lets you train massive language models by updating just a sliver of parameters, leaving the rest frozen solid.

Velocity (7d): +0.7 ★ / day · Trend: steady

What it does

OpenDelta is a Python toolkit for “delta tuning” — parameter-efficient methods like LoRA, adapters, and prefix-tuning that let you specialize huge pre-trained models without touching most of their weights. You freeze the backbone, inject small trainable modules, and save only those deltas when you’re done. The library wraps around Hugging Face transformers and claims to work with any PyTorch backbone, though it only verifies support for common models.

The interesting bit

The detach() method is a nice touch: yank the delta modules out and your model reverts to its original behavior instantly, like nothing happened. The README also shows regex-based targeting of specific layers (e.g., LoRA on just the last four decoder blocks), which is where the real savings live — not in the defaults, but in surgical precision.
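To make the regex idea concrete, here is a minimal sketch of name-based targeting using only Python's `re` module. It does not call OpenDelta: the module names, the 24-block depth, and the `last_n_blocks_pattern` helper are all hypothetical, and real names depend on the backbone you load.

```python
import re

# Hypothetical module names for a 24-block decoder. Real backbones use their
# own naming schemes, so inspect the loaded model's module names first.
NUM_BLOCKS = 24
module_names = [
    f"decoder.block.{i}.attention.{proj}"
    for i in range(NUM_BLOCKS)
    for proj in ("q", "k", "v", "o")
]


def last_n_blocks_pattern(num_blocks: int, n: int) -> re.Pattern:
    """Build a regex matching the q and v projections in the last n blocks."""
    wanted = "|".join(str(i) for i in range(num_blocks - n, num_blocks))
    return re.compile(rf"^decoder\.block\.({wanted})\.attention\.(q|v)$")


pattern = last_n_blocks_pattern(NUM_BLOCKS, 4)
targets = [name for name in module_names if pattern.search(name)]

print(f"{len(targets)} of {len(module_names)} modules selected")
for name in targets:
    print(name)
```

Anchoring the pattern with `^` and `$` keeps it from matching similarly named modules elsewhere in a wrapped model, which is the usual failure mode when a pattern selects more (or fewer) layers than intended.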

Key highlights

  • Supports LoRA, adapters, prefix-tuning, soft prompt tuning, and custom delta methods
  • AutoDeltaModel can load pretrained deltas from a “Delta Center” hub or local paths
  • Saves only delta parameters (example: a 1.4 MB delta file instead of a full T5-large checkpoint)
  • Regex-based module addressing for fine-grained control
  • Works with wrapped/custom models that contain standard PLMs as submodules
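Why delta checkpoints stay small can be shown with back-of-envelope arithmetic. The sketch below assumes a LoRA-style delta, where each adapted weight matrix gets two low-rank factors. The hidden size, rank, and number of adapted matrices are illustrative assumptions, not measurements, and they are not meant to reproduce the 1.4 MB figure above.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def megabytes(num_params: int, bytes_per_param: int = 4) -> float:
    """Storage size in MB, assuming 32-bit floats by default."""
    return num_params * bytes_per_param / 1e6


# Illustrative assumptions only: a square 1024-wide projection, rank 8,
# and 48 adapted matrices across the model.
hidden = 1024
rank = 8
adapted_matrices = 48

delta_params = adapted_matrices * lora_param_count(hidden, hidden, rank)
full_matrix_params = adapted_matrices * hidden * hidden

print(f"delta parameters:   {delta_params:,} ({megabytes(delta_params):.2f} MB)")
print(f"adapted full size:  {full_matrix_params:,} ({megabytes(full_matrix_params):.2f} MB)")
print(f"reduction factor:   {full_matrix_params // delta_params}x")
```

The reduction factor here covers only the adapted matrices. The gap against a whole-model checkpoint is larger, because the frozen weights that were never adapted are not saved at all.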

Caveats

  • Soft Prompt Tuning and Prefix Tuning had bugs as of March 2022 related to custom attention and token type IDs; the README suggests avoiding them
  • “Delta Center” hub webpage was “still under construction” as of the last update
  • Version 0.3.2 is pinned to Python 3.8.13, PyTorch 1.12.1, and transformers 4.22.2 — newer versions are “likely” to work but not guaranteed

Verdict

Worth a look if you’re already in the Hugging Face ecosystem and need to ship many fine-tuned variants without burning GPU-years on full fine-tuning. Skip it if you need rock-solid prompt-tuning support today, or if you’re not willing to debug layer-name regexes when the defaults miss.
