
NVIDIA-NeMo/Gym

A library for evaluating and improving AI agents and language models across scalable, stateful environments with shared benchmarks and verifiers.

966 stars · Python · Agents · LLMOps · Eval
Velocity (7d): +3.4 ★ / day
Trend: steady

NeMo Gym provides infrastructure to develop environments for AI agents, run scalable evaluation and training, and access a collection of benchmark environments. An environment comprises a dataset of tasks, an agent harness defining model-world interaction, a verifier for scoring task completion, and per-task execution state. It supports transitioning between evaluation, agent optimization, and training workflows at scale.
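To make the four parts of an environment concrete, here is a minimal Python sketch. It is not NeMo Gym's API: the names `Task`, `Environment`, `single_turn_harness`, `exact_match`, and `toy_model` are all hypothetical, chosen only to show how a task dataset, an agent harness, a verifier, and per-task state could fit together.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Task:
    """One item from the environment's dataset (hypothetical shape)."""
    prompt: str
    expected: str


@dataclass
class Environment:
    """Illustrative container for the four parts; not NeMo Gym's real classes."""
    tasks: List[Task]                                            # dataset of tasks
    harness: Callable[[Callable[[str], str], Task, Dict], str]   # model-world interaction
    verifier: Callable[[Task, str], float]                       # scores task completion

    def evaluate(self, model: Callable[[str], str]) -> float:
        """Run every task with a fresh state and return the mean verifier score."""
        scores = []
        for task in self.tasks:
            state: Dict = {}  # per-task execution state, never shared across tasks
            answer = self.harness(model, task, state)
            scores.append(self.verifier(task, answer))
        return sum(scores) / len(scores) if scores else 0.0


def single_turn_harness(model, task, state):
    """Simplest possible harness: one model call, recorded in the task state."""
    state["turns"] = state.get("turns", 0) + 1
    return model(task.prompt)


def exact_match(task, answer):
    """Verifier that awards 1.0 for an exact (whitespace-trimmed) match."""
    return 1.0 if answer.strip() == task.expected else 0.0


def toy_model(prompt):
    """Stand-in for a language model; answers only one question correctly."""
    return "4" if prompt == "2 + 2 = ?" else "unknown"


env = Environment(
    tasks=[Task("2 + 2 = ?", "4"), Task("Capital of France?", "Paris")],
    harness=single_turn_harness,
    verifier=exact_match,
)
mean_score = env.evaluate(toy_model)
print(mean_score)
```

Keeping the harness and verifier as plain callables means the same task list can be reused for evaluation or as a reward signal during training, which is the kind of workflow switching the description refers to.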
