← all repositories

bentoml/OpenLLM

OpenLLM serves open-source LLMs as OpenAI-compatible APIs with support for Llama, DeepSeek, Qwen, and other models via a single command.

OpenLLM
Velocity · 7d
+11
★ / day
Trend
steady
star history

OpenLLM is an open-source LLM serving framework that enables developers to run any open-source LLMs as OpenAI-compatible REST APIs with a single command. It provides built-in support for state-of-the-art inference backends, a built-in chat UI, and streamlined deployment to cloud platforms via Docker, Kubernetes, and BentoCloud. The project supports a wide range of models including Llama 3.3, DeepSeek, Qwen2.5, Phi3, Gemma2, and Mistral.

heatdrop uses Google Analytics to see which pages get read — nothing else. Your call. How we handle data.