
radixark/miles

Enterprise reinforcement learning framework for LLM and VLM post-training.

Velocity (7d): +6.3 ★ / day
Trend: steady

Miles is an enterprise-grade reinforcement learning framework for post-training large language models (LLMs) and vision-language models (VLMs). It provides high-performance rollout, supports Megatron and FSDP as training backends, and integrates with SGLang for inference optimization. Advanced features include INT4 quantization-aware training, which helps fit large models into limited VRAM, and a unified multi-turn training pipeline for both VLMs and LLMs.
