
zhaochen0110/Awesome_Think_With_Images

A curated collection of research papers and resources on LVLMs leveraging visual information for complex reasoning, planning, and generation.

Velocity (7d): +3.9 ★ / day · Trend: steady

This repository accompanies a survey paper on multimodal reasoning with images. It systematically curates research on how Large Vision-Language Models can use visual information as a dynamic cognitive workspace for reasoning, planning, and generation tasks. The collection is structured around key themes in the evolving field of multimodal AI.
