Alibaba-NLP/ZeroSearch
ZeroSearch trains large language models to develop search capabilities using simulation-based reinforcement learning without real search engines.

ZeroSearch is a training framework that incentivizes LLMs to acquire search-like reasoning capabilities through reinforcement learning, with simulation LLMs standing in for a search engine. Rather than querying actual search engines during training, it trains policy models in simulated search environments, then deploys them with real search APIs. The project releases policy models, simulation LLMs, and datasets on Hugging Face, compatible with Wikipedia and Google Search.
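
The train-on-simulation, deploy-on-real-search idea can be pictured as a policy that only sees a generic search callable, so the backend can be swapped without touching the policy. The following is a minimal sketch under stated assumptions, not the project's actual code: `simulated_search`, `stub_real_search`, and `toy_policy` are hypothetical names, and both backends return canned strings instead of calling a simulation LLM or a real search API.

```python
from typing import Callable, Dict, List

# A search backend maps a query string to a list of document snippets.
SearchFn = Callable[[str], List[str]]


def simulated_search(query: str) -> List[str]:
    """Hypothetical stand-in for a simulation LLM that writes documents for a query."""
    return [f"[simulated] snippet about: {query}"]


def stub_real_search(query: str) -> List[str]:
    """Hypothetical stand-in for a real search API client used at deployment."""
    return [f"[real] snippet about: {query}"]


def toy_policy(question: str, search: SearchFn) -> Dict[str, object]:
    """Toy policy: issue one query, collect the results, and return a trace."""
    query = question.rstrip("?")
    documents = search(query)
    return {"query": query, "documents": documents}


# Training-time rollouts use the simulated backend...
train_trace = toy_policy("Who wrote Dune?", simulated_search)
# ...while deployment swaps in a real backend without changing the policy code.
deploy_trace = toy_policy("Who wrote Dune?", stub_real_search)
```

Because the policy depends only on the `SearchFn` signature, the same rollout logic runs unchanged against either backend, which is the property the simulated-training setup relies on.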