Gen-Verse/dLLM-RL

TraceRL is a reinforcement learning framework for training and post-training discrete diffusion large language models.

Star velocity (7 days): +1.8 ★/day. Trend: steady.

The repository provides an official implementation for training diffusion-based LLMs with reinforcement learning. It supports a range of discrete diffusion language models, including TraDo, SDAR, Dream, LLaDA, MMaDA, LLaDA-V, and Diffu-Coder. The framework covers post-training via SFT, RL (with optional value models and process rewards), and RLHF, in settings spanning mathematical reasoning, code generation, and multimodal tasks.
