NVlabs/Long-RL

A full-stack framework for scaling reinforcement learning training of vision-language models to long video reasoning.

Star velocity (7d): +2.2 ★/day · Trend: steady

Long-RL addresses the challenge of applying reinforcement learning to long video reasoning in vision-language models. It provides LongVideo-Reason, a 104K-sample dataset with high-quality reasoning annotations across diverse domains, together with a two-stage training pipeline that uses sequence parallelism to handle the computational cost of long sequences. The work produces the LongVILA-R1-7B model, which demonstrates that RL can scale effectively to extended multi-modal contexts.
