Is LLaVA open source?

Yes — haotian-liu/LLaVA is open source, released under the Apache-2.0 license.

What language is LLaVA written in?

haotian-liu/LLaVA is primarily written in Python.

How popular is LLaVA?

haotian-liu/LLaVA has 24.9k stars on GitHub.

Where can I find LLaVA?

haotian-liu/LLaVA is on GitHub at https://github.com/haotian-liu/LLaVA.

← all repositories

haotian-liu/LLaVA

Teaching LLMs to see without billion-dollar budgets

An open-source vision-language model that trains in a day and runs on modest hardware.

★24.9k stars Python Image · Video · Audio Language Models

View on GitHub ↗ Homepage ↗

Not currently ranked — collecting fresh signals.

star history

What it does

LLaVA bolts a vision encoder onto large language models (LLaMA, Qwen, and others) so the resulting model can chat about images, answer visual questions, and follow complex instructions involving what’s on screen. The project ships training code, inference scripts, and a model zoo with checkpoints from 7B to 110B parameters.

The interesting bit

The core trick is visual instruction tuning: using GPT-4 to generate multimodal instruction-following data from image captions, then training the whole stack end-to-end. The original LLaVA-1.5 reportedly trains in ~1 day on a single 8×A100 node using only public data, yet matches or beats models trained on billion-scale datasets. A later variant, LLaVA-NeXT, processes 4× more pixels and apparently outperforms Gemini Pro on some benchmarks.

Key highlights

Supports LoRA fine-tuning with “comparable performance as full-model finetuning” and lower GPU RAM requirements
4-bit/5-bit quantization via llama.cpp; community reports running 13B models on 12 GB VRAM
Zero-shot video understanding in LLaVA-NeXT despite image-only training
RLHF-tuned variants (LLaVA-RLHF) for reduced hallucination
Extensive ecosystem: Colab notebooks, HuggingFace Spaces, AutoGen integration, biomedical spinoff (LLaVA-Med)

Caveats

The README warns non-Linux users off the default install path; macOS and Windows require separate docs
License stack is complicated: Apache 2.0 for the code, but model checkpoints inherit Llama/Qwen/OpenAI dataset terms, so commercial use depends on which base model you pick

Verdict

Worth a look if you need an open, hackable vision-language model you can actually train and deploy without a corporate cluster. Skip it if you want a polished API product with clean liability lines.

Frequently asked

What is haotian-liu/LLaVA?: An open-source vision-language model that trains in a day and runs on modest hardware.
Is LLaVA open source?: Yes — haotian-liu/LLaVA is open source, released under the Apache-2.0 license.
What language is LLaVA written in?: haotian-liu/LLaVA is primarily written in Python.
How popular is LLaVA?: haotian-liu/LLaVA has 24.9k stars on GitHub.
Where can I find LLaVA?: haotian-liu/LLaVA is on GitHub at https://github.com/haotian-liu/LLaVA.