← all repositories

lemony-ai/cascadeflow

A cascading runtime that routes AI agent requests across multiple LLM providers to optimize cost, latency, and quality.

2.5k stars Python AgentsLLMOps · Eval
cascadeflow
Velocity · 7d
+11
★ / day
Trend
steady
star history

CascadeFlow is an intelligent runtime layer for AI agents that automatically routes LLM requests across providers such as OpenAI, Anthropic, Google, Ollama, and others based on cost, latency, and quality requirements. It implements model cascading strategies where simpler tasks are handled by cheaper models while complex requests escalate to more capable models. The system includes built-in policy controls and budget management for optimizing LLM spending in production agent workflows.

heatdrop uses Google Analytics to see which pages get read — nothing else. Your call. How we handle data.