← all repositories
BerriAI/litellm

One OpenAI-shaped hole for 100+ LLMs

LiteLLM is the adapter layer that stops your codebase from fracturing across a dozen provider SDKs.

litellm
Velocity · 7d
+47
★ / day
Trend
steady
star history

What it does LiteLLM gives you a single OpenAI-compatible interface to call over 100 LLM providers — Anthropic, Bedrock, Azure, Gemini, Vertex, and the rest — either as a Python SDK or a self-hosted proxy server. You write one client; LiteLLM handles auth, request translation, and response normalization behind the scenes.

The interesting bit The proxy mode turns LiteLLM into an actual AI Gateway: virtual API keys, spend tracking, guardrails, load balancing, and an admin dashboard all ship out of the box. It also speaks newer protocols — A2A agents and MCP tools — so you can route not just model calls but agent-to-agent traffic and tool use through the same chokepoint.

Key highlights

  • Drop-in OpenAI compatibility: swap gpt-4o for anthropic/claude-sonnet-4-20250514 without touching client code
  • 8ms P95 latency at 1k RPS, per their benchmarks
  • Supports /chat/completions, /embeddings, /images, /audio, /batches, /rerank, /a2a, /messages, and more
  • MCP gateway mode lets Cursor IDE and other clients consume MCP servers through the proxy
  • OSS adopters include Stripe, Netflix, and OpenAI’s own Agents SDK

Caveats

  • The 8ms latency claim is self-reported; verify against your own load patterns
  • MCP client support is marked experimental_mcp_client — API may shift
  • Enterprise features (hosted proxy, SSO, etc.) sit behind a paid tier; the README funnels you toward it aggressively

Verdict If you’re running multi-provider LLM workloads in production and tired of maintaining N different client libraries, this is close to a default choice. Skip it if you’re locked to a single provider and don’t need gateway features — it’s overkill for a one-model shop.

heatdrop uses Google Analytics to see which pages get read — nothing else. Your call. How we handle data.