
RLHF-V/RLAIF-V

Open-source framework for aligning multimodal large language models using AI feedback, achieving GPT-4V-level trustworthiness.

RLAIF-V
Velocity (7d): +0.6 ★ / day. Trend: steady.

RLAIF-V is a framework for training and aligning multimodal large language models using open-source AI feedback. The project provides a full pipeline: high-quality feedback data, online feedback learning algorithms, and pre-trained model weights (7B and 12B variants). Its models and training data are used by projects such as MiniCPM-Llama3-V 2.5 to build competitive vision-language models.
