roboflow/inference
A computer vision model inference server that runs on any hardware via Docker, supporting YOLO, SAM, CLIP, Florence-2 and other vision foundation models.

Inference is a self-hosted computer vision inference platform that deploys fine-tuned models and foundation models (Florence-2, CLIP, SAM2) on any device, including edge hardware such as NVIDIA Jetson. It provides a Docker-based inference server with ONNX and TensorRT runtimes, workflow orchestration for chaining vision tasks, camera and video stream management, and built-in traditional CV methods such as OCR and barcode reading.
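
To make "chaining vision tasks" concrete, here is a minimal conceptual sketch in plain Python. It is not the Inference API: the `Pipeline` class and the `detect`, `crop`, and `read_text` steps are hypothetical stand-ins, and the "image" is a toy grid of characters so the example runs without any model or server. It only shows the general pattern of passing each step's output to the next.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# Hypothetical type for illustration only; NOT part of the Inference API.
Step = Callable[[Dict[str, Any]], Dict[str, Any]]


@dataclass
class Pipeline:
    """Runs steps in order, merging each step's output into a shared context."""
    steps: List[Step]

    def run(self, image: Any) -> Dict[str, Any]:
        context: Dict[str, Any] = {"image": image}
        for step in self.steps:
            # Each step reads what it needs from the context and returns new keys.
            context.update(step(context))
        return context


def detect(context: Dict[str, Any]) -> Dict[str, Any]:
    # Stand-in for an object detector: returns fixed (x0, y0, x1, y1) boxes.
    return {"boxes": [(0, 0, 2, 2), (1, 1, 3, 3)]}


def crop(context: Dict[str, Any]) -> Dict[str, Any]:
    # Cuts each detected box out of the toy image (a list of strings).
    image = context["image"]
    crops = [
        [row[x0:x1] for row in image[y0:y1]]
        for (x0, y0, x1, y1) in context["boxes"]
    ]
    return {"crops": crops}


def read_text(context: Dict[str, Any]) -> Dict[str, Any]:
    # Stand-in for OCR: concatenates the characters inside each crop.
    return {"texts": ["".join(rows) for rows in context["crops"]]}


pipeline = Pipeline(steps=[detect, crop, read_text])
image = ["abcd", "efgh", "ijkl", "mnop"]  # toy 4x4 "image"
result = pipeline.run(image)
print(result["texts"])
```

In a real deployment the stand-in steps would be replaced by calls to models served by the inference server; the shared-context design here is just one simple way to wire steps together.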