PrimeIntellect-ai/prime-rl
A framework for large-scale reinforcement learning that trains autonomous agents using asynchronous RL on distributed GPU infrastructure.

PRIME-RL is a reinforcement learning training framework designed for agentic systems at scale. It provides fully asynchronous RL training capable of scaling to 1000+ GPUs, and supports mixture-of-experts models with distributed training strategies including FSDP2, expert parallelism (EP), and context parallelism (CP). The framework integrates vLLM for inference, supports FP8 inference and prefill-decode (PD) disaggregation, and includes native support for agentic and software engineering (SWE) environments through the Environments Hub. It offers end-to-end post-training capabilities: SFT, RL training, and evaluations.
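
To make "fully asynchronous RL" concrete, here is a minimal conceptual sketch in plain Python `asyncio`. It is not PRIME-RL's API: every name in it (`Rollout`, `PolicyState`, `inference_worker`, `trainer`, `max_staleness`) is hypothetical and chosen for illustration. It assumes the common pattern in which inference keeps producing rollouts while the trainer updates the policy, and the trainer discards rollouts generated by weights that are too many versions old.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Rollout:
    policy_version: int  # version of the weights that generated this sample
    reward: float


class PolicyState:
    """Shared counter standing in for the current policy weights (hypothetical)."""

    def __init__(self) -> None:
        self.version = 0


async def inference_worker(state: PolicyState, queue: asyncio.Queue, stop: asyncio.Event) -> None:
    # Keeps generating without waiting for the trainer to finish a step.
    while not stop.is_set():
        # Placeholder for real generation (e.g. a call to an inference server).
        await queue.put(Rollout(policy_version=state.version, reward=1.0))
        await asyncio.sleep(0)  # yield control to other tasks


async def trainer(state, queue, stop, num_steps, batch_size, max_staleness):
    staleness_per_step = []
    for _ in range(num_steps):
        batch = []
        while len(batch) < batch_size:
            rollout = await queue.get()
            # Drop samples produced by weights that are too many versions behind.
            if state.version - rollout.policy_version <= max_staleness:
                batch.append(rollout)
        staleness_per_step.append([state.version - r.policy_version for r in batch])
        # Placeholder for a real optimizer step and weight broadcast.
        state.version += 1
    stop.set()
    return staleness_per_step


async def main(num_steps: int = 5, batch_size: int = 4, max_staleness: int = 2):
    state = PolicyState()
    queue: asyncio.Queue = asyncio.Queue(maxsize=16)
    stop = asyncio.Event()
    worker = asyncio.create_task(inference_worker(state, queue, stop))
    staleness = await trainer(state, queue, stop, num_steps, batch_size, max_staleness)
    worker.cancel()  # the worker may be blocked on a full queue
    await asyncio.gather(worker, return_exceptions=True)
    return state.version, staleness


if __name__ == "__main__":
    version, staleness = asyncio.run(main())
    print("final policy version:", version)
    print("staleness of rollouts used per step:", staleness)
```

The bounded queue and the staleness check are the two knobs in this sketch: the queue limits how far generation can run ahead, and `max_staleness` limits how off-policy the training data is allowed to become.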