m87-labs/moondream
A lightweight vision-language model that answers questions about images using natural language.

Velocity (7d): +11 ★ / day · Trend: steady
Moondream is a compact vision-language model designed to run on resource-constrained devices. It takes an image as input and either describes it or answers questions about its content in natural language. A vision encoder processes the image and a language model backbone generates the text response, which makes the model suitable for edge deployments and other applications that need efficient multimodal inference.
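The encoder-plus-backbone data flow described above can be illustrated with a toy sketch. This is not Moondream's code or API: `encode_image`, `embed_text`, `build_decoder_input`, and `EMBED_DIM` are hypothetical names, the "encoders" are simple arithmetic stand-ins for learned networks, and placing image embeddings ahead of text embeddings is an assumption about how such models are commonly wired.

```python
from typing import List

Vector = List[float]
EMBED_DIM = 4  # hypothetical embedding width, chosen only for illustration


def encode_image(pixels: List[List[float]], patch_size: int) -> List[Vector]:
    """Stand-in vision encoder: one embedding per non-overlapping patch."""
    height, width = len(pixels), len(pixels[0])
    embeddings: List[Vector] = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = [
                pixels[r][c]
                for r in range(top, min(top + patch_size, height))
                for c in range(left, min(left + patch_size, width))
            ]
            mean = sum(patch) / len(patch)
            # A real encoder is a learned network; here we just repeat the patch mean.
            embeddings.append([mean] * EMBED_DIM)
    return embeddings


def embed_text(prompt: str) -> List[Vector]:
    """Stand-in token embedding: one vector per whitespace-separated token."""
    return [[float(len(token))] * EMBED_DIM for token in prompt.split()]


def build_decoder_input(
    pixels: List[List[float]], prompt: str, patch_size: int = 2
) -> List[Vector]:
    """Concatenate image embeddings and prompt embeddings into one sequence.

    Assumption: image embeddings come first, followed by the text prompt.
    A language model backbone would consume this sequence to generate an answer.
    """
    return encode_image(pixels, patch_size) + embed_text(prompt)


if __name__ == "__main__":
    image = [
        [0.0, 0.0, 1.0, 1.0],
        [0.0, 0.0, 1.0, 1.0],
        [1.0, 1.0, 0.0, 0.0],
        [1.0, 1.0, 0.0, 0.0],
    ]
    sequence = build_decoder_input(image, "What is this?")
    print(len(sequence), "embeddings of width", len(sequence[0]))
```

The point of the sketch is only the shape of the pipeline: the image becomes a short sequence of vectors in the same space as the text tokens, so a single text-generating backbone can attend to both.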