
Gen-Verse/ReasonFlux

Open-source LLM post-training suite from Princeton and ByteDance featuring reasoning optimization via reinforcement learning and process reward models.

537 stars · Python · Language Models · ML Frameworks
Velocity (7d): +1.1 ★ / day · Trend: steady

ReasonFlux is a comprehensive post-training framework for developing advanced LLM reasoning capabilities. It includes ReasonFlux-PRM for trajectory-aware process reward modeling, ReasonFlux-Coder for RL-based code generation with co-evolved unit testers, and ReasonFlux-Zero/F1 for hierarchical chain-of-thought reasoning via thought templates. The suite focuses on data selection, reinforcement learning, and inference scaling to improve long-CoT reasoning performance.
