
thunlp/InfLLM

A training-free memory mechanism that enables pre-trained LLMs to process extremely long sequences without fine-tuning.

Velocity (7d): +0.5 ★/day · Trend: steady

InfLLM stores distant contexts in additional memory units and employs an efficient lookup mechanism to retrieve token-relevant units for attention computation. This allows LLMs pre-trained on shorter sequences to process long inputs while maintaining the ability to capture long-distance dependencies. The method uses faiss for retrieval and requires no training or fine-tuning, making it directly applicable to LLM-driven agents with lengthy streaming inputs.
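To make the lookup idea concrete, here is a minimal sketch of block-level retrieval, not InfLLM's actual implementation. It assumes that distant key vectors are grouped into fixed-size memory units, that each unit is summarized by the mean of its keys, and that relevance is a plain dot product computed by brute force. The function names (`build_memory_units`, `unit_representative`, `retrieve_units`) and the parameters `block_size` and `top_k` are hypothetical, and the exhaustive scoring loop stands in for the faiss-based retrieval the description mentions.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def build_memory_units(keys: List[Vector], block_size: int) -> List[List[Vector]]:
    """Split distant-context key vectors into fixed-size blocks (memory units)."""
    return [keys[i:i + block_size] for i in range(0, len(keys), block_size)]


def unit_representative(unit: List[Vector]) -> List[float]:
    """Summarize a unit by the mean of its keys (an illustrative assumption)."""
    dim = len(unit[0])
    return [sum(k[d] for k in unit) / len(unit) for d in range(dim)]


def retrieve_units(query: Vector, units: List[List[Vector]], top_k: int) -> List[int]:
    """Return indices of the top_k units most relevant to the query, best first."""
    scores = [dot(query, unit_representative(u)) for u in units]
    ranked = sorted(range(len(units)), key=lambda i: scores[i], reverse=True)
    return ranked[:top_k]


# Toy example: six distant key vectors in two dimensions, grouped in blocks of two.
keys = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [-1.0, 0.0], [-0.9, -0.1]]
units = build_memory_units(keys, block_size=2)
selected = retrieve_units(query=[0.0, 1.0], units=units, top_k=1)
print(selected)
```

In a full pipeline, attention for the current token would then be computed over the keys in the selected units together with the local window, rather than over the entire history.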
