langfengQ/verl-agent

A reinforcement learning framework for training large language model and vision-language model agents using group-in-group policy optimization.

Velocity (7d): +4.5 ★ / day
Trend: steady

verl-agent extends the veRL framework to train LLM and VLM agents with reinforcement learning. It introduces a step-independent multi-turn rollout mechanism: the input for each step is built separately, with a fully customizable structure and history management, instead of concatenating the full interaction history. The project implements Group-in-Group Policy Optimization (GiGPO), an agent-training algorithm published at NeurIPS 2025.
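
To make the idea of a step-independent rollout concrete, here is a minimal Python sketch. It is not verl-agent's code or API: `StepSample`, `build_step_input`, `rollout`, `echo_policy`, and the `window` parameter are all hypothetical names chosen for illustration. The sketch assumes only what the description above states, namely that each step gets its own freshly built input and that how much history goes into it is up to the user.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class StepSample:
    """One independent training sample: the input for a step and the action taken."""
    prompt: str
    action: str


def build_step_input(task: str, history: List[Tuple[str, str]],
                     observation: str, window: int = 2) -> str:
    """Hypothetical per-step input builder: keeps only the last `window` steps."""
    # Guard window == 0 explicitly, because history[-0:] would return everything.
    recent = history[-window:] if window > 0 else []
    lines = [f"Task: {task}"]
    for past_obs, past_act in recent:
        lines.append(f"Past observation: {past_obs} | Past action: {past_act}")
    lines.append(f"Current observation: {observation}")
    return "\n".join(lines)


def rollout(task: str, observations: List[str],
            policy: Callable[[str], str], window: int = 2) -> List[StepSample]:
    """Build a fresh prompt at every step instead of appending to one long context."""
    history: List[Tuple[str, str]] = []
    samples: List[StepSample] = []
    for obs in observations:
        prompt = build_step_input(task, history, obs, window)
        action = policy(prompt)
        samples.append(StepSample(prompt, action))
        history.append((obs, action))
    return samples


def echo_policy(prompt: str) -> str:
    """Toy stand-in for a model: acts on the text of the current observation."""
    current = prompt.splitlines()[-1].split(": ", 1)[1]
    return "act_on:" + current


samples = rollout("find the key",
                  ["room A", "room B", "room C", "room D"],
                  echo_policy, window=2)
for s in samples:
    print(len(s.prompt), s.action)
```

With a fixed `window`, the prompt stops growing once the window is full, whereas concatenating the whole interaction would make each step's input longer than the last. Swapping `build_step_input` for a summarizing or filtering builder is one way such history management could be customized.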
