
EvolvingLMMs-Lab/open-r1-multimodal

A fork of open-r1 that adds multimodal RL training support for vision-language models using the GRPO algorithm.

1.6k stars · Python · Language Models · ML Frameworks
Velocity (7d): +3.1 ★/day · Trend: steady

This repository extends the open-r1 project to support training multimodal reasoning models. It implements the GRPO (Group Relative Policy Optimization) algorithm for training vision-language models such as Qwen2-VL and Aria-MoE on math reasoning tasks. The project provides open-source training datasets with reasoning paths and verifiable answers, trained model checkpoints, and scripts for creating custom multimodal reasoning data.
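As a rough illustration of the idea behind GRPO with verifiable answers, the sketch below scores a group of sampled completions for one prompt and normalizes each reward against the group's mean and standard deviation. It is not taken from the repository: the function names `accuracy_reward` and `group_relative_advantages`, the "answer on the last line" convention, the use of the population standard deviation, and the `eps` value are all assumptions made for the example.

```python
from statistics import mean, pstdev


def accuracy_reward(completion: str, answer: str) -> float:
    """Hypothetical verifiable reward: 1.0 if the last line equals the reference answer."""
    lines = completion.strip().splitlines()
    final = lines[-1].strip() if lines else ""
    return 1.0 if final == answer.strip() else 0.0


def group_relative_advantages(rewards, eps=1e-4):
    """Normalize rewards within one group of samples drawn for the same prompt.

    Each advantage is (reward - group mean) / (group std + eps), so a completion
    is judged relative to its siblings rather than by a learned value function.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; implementations may differ
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled completions for a single prompt whose reference answer is "4".
completions = ["x = 2\n4", "x = 3\n6", "so the result is\n4", "5"]
rewards = [accuracy_reward(c, "4") for c in completions]
advantages = group_relative_advantages(rewards)

for reward, advantage in zip(rewards, advantages):
    print(f"reward={reward:.1f} advantage={advantage:+.3f}")
```

Correct completions end up with positive advantages and incorrect ones with negative advantages. When every completion in a group earns the same reward, all advantages are zero and that group contributes no learning signal.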
