jquesnelle/yarn
A method for extending the context window length of large language models using modified rotary position embeddings.

Velocity (7d): +1.6 ★/day · Trend: steady
YaRN provides a technique for extending the context window of LLMs (LLaMA, Mistral, SOLAR) beyond their original training length. The approach modifies Rotary Position Embeddings (RoPE) to allow models to handle longer contexts without catastrophic degradation. The repository publishes fine-tuned model weights on Hugging Face with 32K-128K context windows, and includes the training code and evaluation data from the ICLR 2024 paper.
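
As a rough illustration of what "modifying RoPE" can mean here, the sketch below rescales RoPE frequencies per dimension and computes an attention scaling factor. It follows the formulation in the YaRN paper as best recalled, not the repository's actual code. The function names (`rope_inv_freq`, `yarn_inv_freq`, `yarn_attention_scale`) are hypothetical, and the default values (`base=10000`, `beta_fast=32`, `beta_slow=1`, the `0.1 * ln(s) + 1` factor) are assumptions that should be checked against the paper and the repository before use.

```python
import math


def rope_inv_freq(dim, base=10000.0):
    """Standard RoPE inverse frequencies, one per pair of head dimensions."""
    return [base ** (-2.0 * i / dim) for i in range(dim // 2)]


def yarn_inv_freq(dim, scale, original_ctx, base=10000.0,
                  beta_fast=32.0, beta_slow=1.0):
    """Blend interpolated and unmodified RoPE frequencies per dimension.

    Hypothetical helper: dimensions that complete many rotations over the
    original context are left unchanged, dimensions that complete less than
    about one rotation are divided by `scale`, and those in between are
    linearly blended.
    """
    scaled = []
    for freq in rope_inv_freq(dim, base):
        # Full rotations this dimension completes over the original context.
        rotations = original_ctx * freq / (2.0 * math.pi)
        # Linear ramp clamped to [0, 1]: 1 keeps freq, 0 fully interpolates.
        keep = (rotations - beta_slow) / (beta_fast - beta_slow)
        keep = min(1.0, max(0.0, keep))
        scaled.append(keep * freq + (1.0 - keep) * freq / scale)
    return scaled


def yarn_attention_scale(scale):
    """Assumed factor applied to the rotary cos/sin terms for scale > 1."""
    if scale <= 1.0:
        return 1.0
    return 0.1 * math.log(scale) + 1.0


# Example: stretch a 4K-context model by 16x with 128-dimensional heads.
base_freqs = rope_inv_freq(128)
yarn_freqs = yarn_inv_freq(128, scale=16.0, original_ctx=4096)
attn_scale = yarn_attention_scale(16.0)
```

With these assumed settings, the highest-frequency dimension is left untouched while the lowest-frequency dimension is divided by the full scale factor, which is the intuition behind extending context without disturbing short-range positional resolution.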