NVlabs/QeRL
QeRL enables reinforcement learning training for 32B-parameter LLMs on a single H100 GPU by combining NVFP4 quantization with LoRA.

Velocity (7d): +2.0 ★/day · Trend: steady
QeRL is a quantization-enhanced reinforcement learning framework for large language models that addresses the resource intensity of RL training. It combines NVFP4 quantization with Low-Rank Adaptation (LoRA) to accelerate the rollout phase of RL and reduce memory overhead, enabling training of 32B-parameter models on a single H100 GPU.
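To make the pairing of a quantized base model with LoRA concrete, here is a minimal pure-Python sketch. It is not QeRL's code: the helper names (`fake_quantize`, `lora_forward`), the block size of 16, and the use of a plain float scale per block are assumptions chosen for illustration. The sketch only simulates rounding weights to the 4-bit E2M1 value grid and then applies a low-rank update on top of the frozen, quantized weights.

```python
import random

# Magnitudes representable by a 4-bit E2M1 float, the element format NVFP4 builds on.
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK = 16  # assumption: number of consecutive weights that share one scale


def fake_quantize(row, block=BLOCK):
    """Simulate 4-bit quantization: scale each block so its largest
    magnitude maps to 6.0, round to the nearest grid value, scale back."""
    out = []
    for start in range(0, len(row), block):
        chunk = row[start:start + block]
        amax = max(abs(v) for v in chunk)
        scale = amax / E2M1_GRID[-1] if amax > 0 else 1.0
        for v in chunk:
            mag = min(E2M1_GRID, key=lambda g: abs(abs(v) / scale - g))
            out.append((mag if v >= 0 else -mag) * scale)
    return out


def matvec(matrix, vec):
    """Plain matrix-vector product on nested lists."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def lora_forward(w_q, lora_a, lora_b, x, alpha=1.0):
    """y = W_q x + (alpha / r) * B (A x).
    W_q stays frozen; only the small matrices A and B would be trained."""
    rank = len(lora_a)
    base = matvec(w_q, x)
    update = matvec(lora_b, matvec(lora_a, x))
    return [b + (alpha / rank) * u for b, u in zip(base, update)]


random.seed(0)
d_in, d_out, rank = 32, 8, 4
w = [[random.gauss(0, 0.02) for _ in range(d_in)] for _ in range(d_out)]
w_q = [fake_quantize(row) for row in w]  # frozen, quantized base weights
lora_a = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
lora_b = [[0.0] * rank for _ in range(d_out)]  # zero init: adapter starts as a no-op
x = [random.gauss(0, 1) for _ in range(d_in)]
y = lora_forward(w_q, lora_a, lora_b, x)
```

In this toy setup the adapter adds `rank * (d_in + d_out)` trainable values next to a base matrix stored at roughly 4 bits per weight, which is the general reason the combination lowers memory use. A real implementation would rely on hardware FP4 kernels rather than Python loops.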