Coobiw/MPP-LLaVA

A pipeline-parallel training framework for Qwen-based multimodal large language models, supporting image, video, and multi-image inputs.

Star velocity (7d): +0.7 ★/day. Trend: steady.

This repository provides a distributed training system for multimodal large language models built on Qwen-LM. It combines pipeline parallelism (PP) with data parallelism (DP) using DeepSpeed, which allows fine-tuning 8B/14B models on consumer GPUs with 24 GB of memory, such as the RTX 3090/4090. It supports supervised fine-tuning on image, video, and multi-image conversational data, so LLaVA-like multimodal LLMs can be trained without enterprise-grade hardware.
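To illustrate how pipeline parallelism and data parallelism combine, here is a minimal sketch in plain Python. It is not taken from the repository and does not use DeepSpeed. The function names `build_grid` and `split_layers`, the rank ordering, and the example sizes (8 GPUs, 4 stages, 40 layers) are all assumptions made for illustration. The idea it shows: the GPUs are divided into data-parallel replicas, each replica is a chain of pipeline stages, and each stage holds a contiguous slice of the model's layers, so no single GPU has to hold the whole model.

```python
def build_grid(world_size: int, num_stages: int) -> list[list[int]]:
    """Group GPU ranks into data-parallel replicas of `num_stages` pipeline stages.

    The rank ordering used here is a hypothetical layout for illustration;
    a real framework may assign ranks differently.
    """
    if num_stages <= 0 or world_size % num_stages != 0:
        raise ValueError("world_size must be a positive multiple of num_stages")
    num_replicas = world_size // num_stages  # data-parallel degree
    return [
        [replica * num_stages + stage for stage in range(num_stages)]
        for replica in range(num_replicas)
    ]


def split_layers(num_layers: int, num_stages: int) -> list[range]:
    """Assign contiguous layer indices to each pipeline stage as evenly as possible."""
    base, extra = divmod(num_layers, num_stages)
    slices = []
    start = 0
    for stage in range(num_stages):
        # The first `extra` stages take one additional layer each.
        size = base + (1 if stage < extra else 0)
        slices.append(range(start, start + size))
        start += size
    return slices


if __name__ == "__main__":
    # Hypothetical setup: 8 GPUs, 4 pipeline stages, a 40-layer model.
    for replica_id, ranks in enumerate(build_grid(world_size=8, num_stages=4)):
        print(f"replica {replica_id}: ranks {ranks}")
    for stage_id, layers in enumerate(split_layers(num_layers=40, num_stages=4)):
        print(f"stage {stage_id}: layers {layers.start}..{layers.stop - 1}")
```

In this layout each GPU stores roughly `1 / num_stages` of the layers, while the replicas process different batches of data in parallel.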
