uclaml/SPIN

SPIN is a self-play fine-tuning technique that iteratively improves language models by having them compete against previous versions.

1.2k stars · Python · Language Models · ML Frameworks
Velocity (7d): +1.5 ★/day · Trend: steady

This repository provides the official implementation of Self-Play Fine-Tuning (SPIN) for large language models. At each iteration, the current model generates responses to the prompts of a supervised fine-tuning dataset, and the next model is trained to distinguish those self-generated responses from the human-written ones, so a weak model improves without additional human-annotated data. The paper was published at ICML 2024. The repository includes training scripts, models (Zephyr-7B variants), and evaluation on the Open LLM Leaderboard and MT-Bench.
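The self-play objective can be illustrated with a minimal sketch. It assumes the logistic-loss formulation described in the SPIN paper, where the new model is rewarded for raising the likelihood of the human response relative to the previous iteration while lowering it for the previous iteration's own generation. The function name `spin_loss`, its argument names, and the default `lam` are hypothetical choices for illustration, not the repository's API.

```python
import math


def spin_loss(logp_real_new, logp_real_old, logp_synth_new, logp_synth_old, lam=0.1):
    """Logistic self-play loss for one (prompt, human response, generated response) triple.

    Each argument is a summed log-probability of a full response given the prompt:
      *_new  -> under the model being trained
      *_old  -> under the previous-iteration model (held fixed)
      real   -> the human-written response
      synth  -> the response generated by the previous-iteration model
    `lam` is a hypothetical default for the scaling coefficient.
    """
    margin = lam * (
        (logp_real_new - logp_real_old) - (logp_synth_new - logp_synth_old)
    )
    # log(1 + exp(-margin)), computed in a numerically stable way.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# When the new model equals the old one, the margin is zero and the loss is log(2).
baseline = spin_loss(-2.0, -2.0, -2.0, -2.0)

# Favoring the human response over the self-generated one lowers the loss.
improved = spin_loss(-1.0, -2.0, -3.0, -2.0)
```

In practice the log-probabilities would come from summing token log-likelihoods under each model; plain floats are used here only to keep the sketch self-contained.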