yuanze-lin/Olympus
A universal vision-language foundation model that routes diverse computer vision tasks to specialized solutions.

Olympus accompanies a CVPR 2025 Highlight paper and implements a universal task router for computer vision built on vision-language models. The system accepts a range of computer vision inputs and routes each one to an appropriate task-specific solution. The routing decision is made by a multimodal foundation model that combines vision and language understanding. The repository includes training and inference code, and model weights are available on Hugging Face.
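
To make the routing idea concrete, here is a minimal sketch of a dispatch-style task router. It is not the Olympus implementation: `Request`, `predict_task`, `HANDLERS`, and `route` are hypothetical names introduced for illustration, the task labels are assumed, and a simple keyword match stands in for the vision-language model that would make the routing decision in the real system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Request:
    """A user request: an instruction plus an optional visual input (hypothetical type)."""
    instruction: str
    image_path: Optional[str] = None


def predict_task(request: Request) -> str:
    """Stand-in for the routing model.

    Assumption: a real router would ask a vision-language model to pick the
    task. Keyword matching is used here only so the sketch runs without weights.
    """
    text = request.instruction.lower()
    if "segment" in text:
        return "segmentation"
    if "detect" in text:
        return "detection"
    if "generate" in text or "draw" in text:
        return "image_generation"
    return "visual_qa"  # fallback when no keyword matches


# Placeholder task-specific solutions; each one only reports what it would do.
HANDLERS: Dict[str, Callable[[Request], str]] = {
    "segmentation": lambda r: f"run segmentation model on {r.image_path}",
    "detection": lambda r: f"run detection model on {r.image_path}",
    "image_generation": lambda r: f"run image generator with prompt {r.instruction!r}",
    "visual_qa": lambda r: f"answer question {r.instruction!r} about {r.image_path}",
}


def route(request: Request) -> Tuple[str, str]:
    """Pick a task for the request and hand it to the matching handler."""
    task = predict_task(request)
    return task, HANDLERS[task](request)


if __name__ == "__main__":
    print(route(Request("Please segment the dog", "dog.jpg")))
    print(route(Request("What color is the car?", "street.jpg")))
```

Keeping the routing decision (`predict_task`) separate from the handler table means new task-specific solutions can be registered without touching the router itself, which is the general appeal of this design.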