lm-sys/RouteLLM
An LLM routing framework that dynamically routes queries to cheaper or stronger models based on complexity.

RouteLLM provides a framework for serving and evaluating LLM routers, which automatically direct each query to an appropriate model. It acts as a drop-in replacement for OpenAI’s client, routing simpler queries to cheaper models while reserving stronger models for complex tasks. The framework ships with pre-trained routers that, according to the project, reduce costs by up to 85% while retaining 95% of GPT-4 performance on benchmarks such as MT Bench. It also supports comparing router performance across multiple benchmarks and can be extended with new router implementations.
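
The routing idea can be illustrated with a minimal, self-contained sketch. This is not RouteLLM's API: the `ThresholdRouter` class, the `length_score` heuristic, and the model names are hypothetical stand-ins, and a real router would typically use a learned scorer rather than query length. The sketch only shows the general pattern of scoring a query and comparing the score against a cost threshold.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ThresholdRouter:
    """Toy router (hypothetical): pick the strong model when the score meets a threshold."""

    score: Callable[[str], float]  # assumed to return a complexity estimate in [0, 1]
    threshold: float               # higher threshold -> fewer queries reach the strong model
    strong_model: str = "strong-model"  # placeholder model names
    weak_model: str = "weak-model"

    def route(self, query: str) -> str:
        """Return the name of the model that should handle this query."""
        if self.score(query) >= self.threshold:
            return self.strong_model
        return self.weak_model


def length_score(query: str) -> float:
    """Stand-in heuristic: treat longer queries as more complex, capped at 1.0."""
    return min(len(query.split()) / 20.0, 1.0)


router = ThresholdRouter(score=length_score, threshold=0.5)
print(router.route("What is 2 + 2?"))
print(router.route("Explain step by step how to prove that the square root of two is irrational"))
```

Raising the threshold sends more traffic to the cheaper model, which lowers cost at the risk of lower answer quality. That trade-off is what a router evaluation is meant to measure.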