
jquesnelle/yarn

A method for extending the context window length of large language models using modified rotary position embeddings.

1.7k stars · Python · Language Models · ML Frameworks
Velocity · 7d: +1.6 ★ / day
Trend: steady

YaRN is a technique for extending the context window of LLMs (LLaMA, Mistral, SOLAR) beyond the sequence length they were originally trained on. It modifies Rotary Position Embeddings (RoPE) so that models can handle longer contexts without catastrophic degradation. The repository publishes fine-tuned model weights on Hugging Face with context windows from 32K to 128K tokens, and includes the training code and evaluation data from the ICLR 2024 paper.
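
To make the RoPE modification more concrete, here is a minimal sketch of the per-dimension frequency interpolation and attention scaling as described in the YaRN paper. It is not the repository's code: the function names `yarn_inv_freq` and `yarn_attention_scale` are hypothetical, and the defaults (base 10000, original context 4096, scale factor 4, ramp bounds 1 and 32) are assumptions chosen to resemble a LLaMA-style setup.

```python
import math


def yarn_inv_freq(dim, base=10000.0, scale=4.0, original_max_pos=4096,
                  alpha=1.0, beta=32.0):
    """Return interpolated RoPE frequencies, one per pair of dimensions.

    Hypothetical helper: names and defaults are illustrative assumptions.
    """
    inv_freq = []
    for i in range(0, dim, 2):
        theta = base ** (-i / dim)           # standard RoPE frequency
        wavelength = 2 * math.pi / theta     # tokens per full rotation
        r = original_max_pos / wavelength    # rotations within the trained context
        # Linear ramp: 0 means fully interpolate, 1 means leave untouched.
        gamma = min(1.0, max(0.0, (r - alpha) / (beta - alpha)))
        # Blend position-interpolated (theta / scale) and original frequencies.
        inv_freq.append((1 - gamma) * theta / scale + gamma * theta)
    return inv_freq


def yarn_attention_scale(scale):
    """Multiplier from the paper's temperature rule: 0.1 * ln(s) + 1."""
    return 0.1 * math.log(scale) + 1.0


freqs = yarn_inv_freq(128)
print(len(freqs), freqs[0], yarn_attention_scale(4.0))
```

Under these assumptions, high-frequency dimensions (many rotations inside the trained context) keep their original frequency, low-frequency dimensions are divided by the scale factor, and the dimensions in between are blended linearly.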
