uclaml/SPIN
SPIN is a self-play fine-tuning technique that iteratively improves a language model by training it against responses generated by its own previous iteration.

Velocity (7d): +1.5 ★/day · Trend: steady
This repository provides the official implementation of Self-Play Fine-Tuning (SPIN), a fine-tuning method for large language models. Starting from a supervised fine-tuned model, each iteration has the current model generate responses to the prompts in its SFT dataset, then trains the next model to prefer the human-written responses over those self-generated ones. The model therefore improves without additional human-annotated data. The paper was published at ICML 2024, and the repository includes training scripts, models (Zephyr-7B variants), and evaluation on the Open LLM Leaderboard and MT-Bench.
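The self-play objective can be sketched for a single training example as follows. This is a minimal illustration, not the repository's code: it assumes the logistic pairwise loss described in the SPIN paper, takes precomputed sequence log-probabilities as plain floats, and uses a hypothetical function name `spin_loss` and an illustrative default `beta=0.1`.

```python
import math


def spin_loss(logp_real_policy: float, logp_real_ref: float,
              logp_gen_policy: float, logp_gen_ref: float,
              beta: float = 0.1) -> float:
    """Logistic self-play loss for one (prompt, human response, generated response) triple.

    logp_real_*: log-probability of the human-written response
    logp_gen_*:  log-probability of the response sampled from the previous iteration
    *_policy:    under the model being trained
    *_ref:       under the frozen previous-iteration model
    beta:        regularization strength (illustrative value, an assumption)
    """
    # How much more the policy favors the human response than the reference does,
    # minus the same quantity for the self-generated response.
    margin = beta * ((logp_real_policy - logp_real_ref)
                     - (logp_gen_policy - logp_gen_ref))
    # log(1 + exp(-margin)), computed in a numerically stable way.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# At the start of an iteration the policy equals the reference, so the margin is zero.
start = spin_loss(-12.0, -12.0, -9.0, -9.0)

# After training, the policy assigns more mass to the human response and less to
# the self-generated one, which lowers the loss.
later = spin_loss(-10.0, -12.0, -11.0, -9.0)
```

In the sketch, the loss falls as the policy shifts probability from the previous iteration's outputs toward the human-written responses. Once the two become indistinguishable, the margin stays near zero and further iterations stop yielding improvement.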