ttengwang/Caption-Anything

A multi-model image captioning tool that combines Segment Anything, visual captioning, and ChatGPT for controllable text generation.

Velocity (7d): +1.5 ★/day · Trend: steady

Caption-Anything is an image processing system that leverages Segment Anything for visual segmentation, a captioning model for text generation, and ChatGPT for language-level control. Users can click on image regions to select objects, then generate descriptive captions with customizable style, length, sentiment, and factuality. The system also supports conversational follow-up about selected objects via ChatGPT integration.
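
The flow described above (click, segment, caption, then language-level control) can be sketched roughly as follows. This is an illustrative outline only, not the project's actual code: `CaptionControls`, `build_refine_prompt`, `caption_click`, and the `segment`, `describe`, and `refine` callables are all hypothetical names standing in for the segmentation model, the captioning model, and the ChatGPT call, and the prompt wording is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Tuple

Point = Tuple[int, int]


@dataclass
class CaptionControls:
    """Hypothetical container for the four controls named in the description."""
    style: str = "plain"
    length: int = 20           # target caption length in words (assumed unit)
    sentiment: str = "neutral"
    factual: bool = True       # False allows imaginative descriptions


def build_refine_prompt(raw_caption: str, controls: CaptionControls) -> str:
    """Combine a raw caption and the controls into one language-model instruction."""
    mode = (
        "Stick to what is visible."
        if controls.factual
        else "Imaginative details are allowed."
    )
    return (
        f"Rewrite this caption in a {controls.style} style with a "
        f"{controls.sentiment} sentiment, in about {controls.length} words. "
        f"{mode}\nCaption: {raw_caption}"
    )


def caption_click(
    image: Any,
    click: Point,
    segment: Callable[[Any, Point], Any],   # stand-in for the segmentation model
    describe: Callable[[Any, Any], str],    # stand-in for the captioning model
    refine: Callable[[str], str],           # stand-in for the ChatGPT call
    controls: Optional[CaptionControls] = None,
) -> str:
    """Run the three stages in order for a single clicked point."""
    controls = controls or CaptionControls()
    mask = segment(image, click)            # 1. select the clicked object
    raw_caption = describe(image, mask)     # 2. caption the selected region
    return refine(build_refine_prompt(raw_caption, controls))  # 3. apply controls


if __name__ == "__main__":
    # Stub callables so the sketch runs without any model installed.
    print(
        caption_click(
            image="image.png",
            click=(120, 80),
            segment=lambda img, pt: {"point": pt},
            describe=lambda img, mask: "a dog on grass",
            refine=lambda prompt: prompt,   # echoes the prompt instead of calling a model
            controls=CaptionControls(style="poetic", length=12),
        )
    )
```

Passing the three stages in as callables keeps the sketch independent of any particular model library; real segmentation, captioning, and chat clients would replace the stubs.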