
LLaVA-VL/LLaVA-Plus-Codebase

A multimodal LLM that learns to plug in and use various tools/skills for general vision tasks.

767 stars · Python · Agents · Language Models
Velocity · 7d: +0.8 ★ / day
Trend: steady

LLaVA-Plus extends large multimodal models with the ability to use tools across modalities. It lets vision-language assistants plug into external models, APIs, and skills to complete complex tasks. The codebase covers training, evaluation, and deployment of these tool-augmented multimodal agents, which handle a range of vision tasks through the integrated external tools.
