microsoft/KBLaM
A method for augmenting large language models with knowledge bases without external retrieval modules, presented at ICLR 2025.

KBLaM is an alternative to retrieval-augmented generation (RAG) and to in-context learning. Rather than retrieving documents at query time or placing the knowledge base in the prompt, it integrates external knowledge directly into the LLM architecture, so computational cost grows linearly with knowledge base size instead of quadratically as it does with in-context learning. The implementation supports popular Hugging Face models, including Llama-3 and Phi-3, and includes tools for generating synthetic knowledge bases and embedding them for use with the augmented language models.
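
The linear-versus-quadratic claim can be made concrete with a small sketch. The code below is an illustration only, not the repository's implementation: the function names, shapes, and single-head NumPy attention are all assumptions chosen for clarity. It assumes that knowledge entries are available as precomputed key and value vectors, that prompt tokens attend to those entries plus earlier prompt tokens, and that knowledge entries do not attend to each other. Under those assumptions the score matrix is rectangular, with N rows and M + N columns for N prompt tokens and M knowledge entries.

```python
import numpy as np


def in_context_scores(num_kb_tokens: int, num_prompt_tokens: int) -> int:
    """Attention scores when the knowledge base is pasted into the prompt.

    Every token attends to every token, so the count is (M + N)^2.
    """
    total = num_kb_tokens + num_prompt_tokens
    return total * total


def rectangular_scores(num_kb_entries: int, num_prompt_tokens: int) -> int:
    """Attention scores when only prompt tokens issue queries.

    The score matrix is N x (M + N), which is linear in M for fixed N.
    """
    return num_prompt_tokens * (num_kb_entries + num_prompt_tokens)


def rectangular_attention(queries, kb_keys, kb_values, keys, values):
    """Single-head attention over knowledge entries and prompt tokens.

    Hypothetical shapes: queries, keys, values are (N, d);
    kb_keys and kb_values are (M, d). Returns an (N, d) array.
    """
    n, d = queries.shape
    m = kb_keys.shape[0]

    all_keys = np.concatenate([kb_keys, keys], axis=0)        # (M + N, d)
    all_values = np.concatenate([kb_values, values], axis=0)  # (M + N, d)
    scores = queries @ all_keys.T / np.sqrt(d)                # (N, M + N)

    # Knowledge entries are always visible; prompt tokens are causally masked.
    causal = np.triu(np.ones((n, n), dtype=bool), k=1)
    mask = np.concatenate([np.zeros((n, m), dtype=bool), causal], axis=1)
    scores = np.where(mask, -np.inf, scores)

    # Numerically stable softmax over each row.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ all_values


if __name__ == "__main__":
    # With M = 1000 and N = 10, the in-context count is 1,020,100
    # while the rectangular count is 10,100.
    print(in_context_scores(1000, 10), rectangular_scores(1000, 10))
```

In this simplified model, each additional knowledge entry adds exactly N scores to the rectangular variant, whereas the in-context count grows with the square of the total sequence length. The comparison is also generous to in-context learning, since a knowledge entry placed in the prompt usually spans several tokens.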