Gen-Verse/Open-AgentRL

RLAnything is a reinforcement learning framework that jointly optimizes policy and reward models for LLMs and agents in dynamic environments.

Velocity (7d): +2.3 ★/day · Trend: steady

The repository implements RLAnything, a closed-loop RL system that dynamically optimizes policy models using outcome and step-wise reward signals while jointly training reward models via consistency feedback. It also includes DemyAgent, a general agentic RL agent. The framework covers terminal, GUI, SWE (software engineering), and tool-call settings, and supports PPO, GRPO, and entropy-based training methods.
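
To make the combination of outcome and step-wise reward signals concrete, here is a minimal sketch in Python. It is not taken from the repository: `blended_reward`, `step_weight`, and the sample trajectories are hypothetical, and the equal-weight blend is an assumption made for illustration. The only part that follows a known method is the group-relative normalization, which is the standard GRPO idea of scoring each sampled trajectory against the others drawn for the same prompt.

```python
from statistics import mean, pstdev


def blended_reward(outcome, step_rewards, step_weight=0.5):
    """Mix a trajectory's final outcome reward with its mean step-wise reward.

    step_weight is a hypothetical knob for this sketch, not a documented
    RLAnything parameter.
    """
    step_avg = mean(step_rewards) if step_rewards else 0.0
    return (1.0 - step_weight) * outcome + step_weight * step_avg


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: standardize each reward within its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; eps guards against a zero spread
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four hypothetical trajectories sampled for one prompt:
# (outcome reward, list of step-wise rewards)
group = [
    (1.0, [0.8, 0.9, 1.0]),
    (0.0, [0.6, 0.2, 0.1]),
    (1.0, [0.4, 0.5, 0.9]),
    (0.0, [0.1, 0.0, 0.2]),
]

rewards = [blended_reward(outcome, steps) for outcome, steps in group]
advantages = group_relative_advantages(rewards)
```

Trajectories that succeed and also take good intermediate steps end up with the largest advantages, while the group mean acts as the baseline, so no separate value model is needed in this sketch.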
