kijai/ComfyUI-Florence2
A ComfyUI custom node for running Microsoft Florence2 vision-language model inference.

This repository provides a ComfyUI integration for the Florence2 vision foundation model, enabling tasks such as image captioning, object detection, semantic segmentation, and document visual question answering. It leverages the model’s sequence-to-sequence architecture to handle multiple vision and vision-language tasks through text prompts. Users can download official Florence2 base or large models, including fine-tuned variants, for use within the ComfyUI workflow.
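
As a rough illustration of how a single sequence-to-sequence model can be steered between tasks by its text prompt, the sketch below builds Florence2-style prompt strings. The task tokens are the ones listed on the public Florence-2 model card. The dictionary keys, the `TASKS_WITH_TEXT` set, and the `build_prompt` helper are hypothetical names introduced here for illustration and are not taken from this repository's code.

```python
# Minimal sketch: mapping a task name to a Florence2-style text prompt.
# The tokens follow the public Florence-2 model card; the keys and the
# helper below are illustrative assumptions, not this repository's API.

TASK_PROMPTS = {
    "caption": "<CAPTION>",
    "detailed_caption": "<DETAILED_CAPTION>",
    "object_detection": "<OD>",
    "ocr": "<OCR>",
    "phrase_grounding": "<CAPTION_TO_PHRASE_GROUNDING>",
    "referring_segmentation": "<REFERRING_EXPRESSION_SEGMENTATION>",
}

# Tasks whose prompt is the task token followed by free text from the user.
TASKS_WITH_TEXT = {"phrase_grounding", "referring_segmentation"}


def build_prompt(task: str, text_input: str = "") -> str:
    """Return the prompt string that selects `task` for the model."""
    if task not in TASK_PROMPTS:
        raise ValueError(f"unknown task: {task!r}")
    token = TASK_PROMPTS[task]
    if task in TASKS_WITH_TEXT:
        text = text_input.strip()
        if not text:
            raise ValueError(f"task {task!r} needs a text input")
        # The extra text is appended directly after the task token.
        return token + text
    # Token-only tasks ignore any extra text.
    return token


if __name__ == "__main__":
    print(build_prompt("caption"))
    print(build_prompt("phrase_grounding", "a green car"))
```

The same image can therefore be captioned, searched for objects, or read for text by changing only the prompt string, which is why one set of model weights covers all of the listed tasks.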