PaddlePaddle/PaddleMIX
PaddleMIX is a PaddlePaddle-based multimodal AI framework providing diffusion models and vision-language models for image, video, and text generation tasks.

Velocity (7d): +0.7 ★/day · Trend: steady
PaddleMIX offers a multimodal model library that includes end-to-end large vision-language models (LLaVA, Qwen2-VL, DeepSeek-VL, InternVL2, MiniCPM-V) and a diffusion model toolbox (ppdiffusers) for text-to-image, text-to-video, and image-to-text generation. It also provides full-pipeline development tools and high-performance distributed training for multimodal AI tasks.