AkariAsai/self-rag
Research implementation of an LLM training framework that learns to retrieve, generate, and self-critique via reflection tokens.

SELF-RAG trains language models to perform on-demand retrieval and self-critique during generation, improving factuality without sacrificing versatility. The framework predicts special reflection tokens to decide when to retrieve and how to evaluate outputs across multiple fine-grained aspects. It uses segment-wise beam search to optimize generation quality according to diverse preferences. Trained LLaMA2 models (7B and 13B) are publicly available.
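
The way reflection tokens can steer segment-wise beam search is easier to see in a small sketch. The code below is not taken from the repository: the `Candidate` class, the function names, the aspect names (`relevance`, `support`), and the token labels are all hypothetical, chosen only for illustration. It assumes each candidate segment comes with a log-probability and, for each critique aspect, a probability distribution over that aspect's reflection tokens. A segment's score is its log-probability plus a weighted sum of the normalized probability of each aspect's desirable token, and the weights are where user preferences enter.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Candidate:
    """One candidate segment (all fields are illustrative assumptions)."""
    text: str
    log_prob: float                         # segment log-probability under the LM
    critique: Dict[str, Dict[str, float]]   # aspect -> {reflection token: probability}


def critique_score(cand: Candidate,
                   desirable: Dict[str, str],
                   weights: Dict[str, float]) -> float:
    """Weighted sum, over aspects, of the normalized probability
    assigned to that aspect's most desirable reflection token."""
    total = 0.0
    for aspect, token_probs in cand.critique.items():
        mass = sum(token_probs.values())
        if mass == 0:
            continue  # no probability mass recorded for this aspect
        wanted = token_probs.get(desirable.get(aspect, ""), 0.0)
        total += weights.get(aspect, 0.0) * wanted / mass
    return total


def select_top_k(cands: List[Candidate],
                 desirable: Dict[str, str],
                 weights: Dict[str, float],
                 k: int) -> List[Candidate]:
    """Keep the k best segments by log-probability plus critique score."""
    ranked = sorted(
        cands,
        key=lambda c: c.log_prob + critique_score(c, desirable, weights),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical token labels; the released models define their own vocabulary.
DESIRABLE = {"relevance": "relevant", "support": "full"}

fluent = Candidate(
    text="fluent but weakly supported segment",
    log_prob=-1.0,
    critique={
        "relevance": {"relevant": 0.9, "irrelevant": 0.1},
        "support": {"full": 0.2, "partial": 0.3, "none": 0.5},
    },
)
grounded = Candidate(
    text="less fluent but well supported segment",
    log_prob=-1.2,
    critique={
        "relevance": {"relevant": 0.8, "irrelevant": 0.2},
        "support": {"full": 0.9, "partial": 0.05, "none": 0.05},
    },
)

# Raising or lowering an aspect weight changes which segment survives the beam.
best_balanced = select_top_k([fluent, grounded], DESIRABLE,
                             {"relevance": 1.0, "support": 1.0}, k=1)[0]
best_ignoring_support = select_top_k([fluent, grounded], DESIRABLE,
                                     {"relevance": 1.0, "support": 0.0}, k=1)[0]
```

With both aspects weighted equally the well supported segment ranks first, and with the support weight set to zero the more fluent segment ranks first. This illustrates how one trained model could be adapted to different preferences at inference time without retraining.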