Osilly/Vision-R1
Vision-R1 is a multimodal LLM (7B–72B) trained with cold-start initialization followed by reinforcement learning to improve mathematical and visual reasoning.

Velocity (7d): +2.7 ★/day · Trend: steady
Vision-R1 applies reinforcement learning techniques inspired by DeepSeek-R1 to multimodal large language models, introducing a cold-start initialization strategy to improve reasoning capabilities. The project releases model weights and two training datasets (Vision-R1-cold and Vision-R1-rl), and reports improved performance on math reasoning benchmarks including MathVista, MathVerse, and DynaMath across multiple model sizes.