
PKU-YuanGroup/MoE-LLaVA

A multi-modal large language model that uses a Mixture-of-Experts architecture to handle vision-language tasks efficiently.

Star velocity (7d): +2.6 ★ / day · Trend: steady

MoE-LLaVA is a vision-language model that applies Mixture-of-Experts techniques to improve efficiency and performance on multi-modal inputs. The project implements sparse activation: only a subset of the expert networks is engaged per forward pass, so model capacity can grow without a proportional increase in compute cost. It provides training code, pre-trained checkpoints, and interactive demos via HuggingFace and Replicate.
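
The sparse activation idea can be illustrated with a minimal top-k routing sketch in plain Python. This is not code from the MoE-LLaVA repository: the names `top_k_route`, `moe_layer`, and `make_expert` are hypothetical, the experts are toy scaling functions, and renormalizing the selected gate weights is an assumption made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalized over the selected experts so they sum to 1
    (an assumption of this sketch, not a claim about MoE-LLaVA).
    """
    probs = softmax(router_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def make_expert(scale):
    """Toy expert: multiplies every feature by a fixed scale (hypothetical)."""
    return lambda vec: [scale * x for x in vec]


def moe_layer(token, router_weights, experts, k=2):
    """Run one token through a sparse MoE layer.

    Only the k selected experts are evaluated; the rest are skipped,
    which is where the compute saving comes from.
    """
    # One router logit per expert: dot product of its weight row with the token.
    logits = [sum(w * x for w, x in zip(row, token)) for row in router_weights]
    routes = top_k_route(logits, k)
    out = [0.0] * len(token)
    for idx, weight in routes:
        expert_out = experts[idx](token)
        out = [o + weight * e for o, e in zip(out, expert_out)]
    return out, [idx for idx, _ in routes]


experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
router_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
output, used = moe_layer([2.0, 1.0], router_weights, experts, k=2)
print(used)  # indices of the experts that actually ran
```

With four experts and `k=2`, each token pays for two expert evaluations while the layer holds the parameters of all four, which is the capacity-versus-compute trade-off described above.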
