
ome-projects/ome

A Kubernetes operator that automates LLM deployment, GPU resource scheduling, and runtime selection for enterprise model serving.

Velocity (7d): +1.2 ★ / day · Trend: steady

OME (Open Model Engine) manages and serves Large Language Models on Kubernetes. It treats models as first-class custom resources and automatically extracts architecture and parameter information from model files. The operator matches each model to a suitable runtime, such as vLLM, SGLang, TensorRT-LLM, or Triton, based on architecture scoring. It also handles distributed storage, multiple model formats (SafeTensors, PyTorch, TensorRT, ONNX), and GPU scheduling across clusters.
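
To make the idea of architecture-based runtime matching concrete, here is a minimal Python sketch. It is not OME's code or API: the `ModelInfo` and `Runtime` classes, their fields, the `score` and `select_runtime` functions, and all numeric scores are hypothetical and chosen only for illustration. It assumes each runtime declares a score per supported architecture, and that format and size limits act as hard filters.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class ModelInfo:
    """Metadata assumed to be extracted from model files (hypothetical fields)."""
    architecture: str      # e.g. "LlamaForCausalLM"
    model_format: str      # e.g. "safetensors"
    parameter_count: int   # total number of parameters


@dataclass
class Runtime:
    """A serving runtime and what it declares it can serve (hypothetical fields)."""
    name: str
    architecture_scores: Dict[str, int]  # architecture -> score (higher is better)
    formats: Set[str]                    # model formats the runtime can load
    max_parameters: int                  # largest model the runtime accepts


def score(model: ModelInfo, runtime: Runtime) -> Optional[int]:
    """Return the runtime's score for this model, or None if incompatible."""
    if model.model_format not in runtime.formats:
        return None
    if model.parameter_count > runtime.max_parameters:
        return None
    # An architecture the runtime does not list is treated as unsupported.
    return runtime.architecture_scores.get(model.architecture)


def select_runtime(model: ModelInfo, runtimes: List[Runtime]) -> Optional[Runtime]:
    """Pick the highest-scoring compatible runtime; ties go to the earliest entry."""
    best: Optional[Runtime] = None
    best_score: Optional[int] = None
    for runtime in runtimes:
        s = score(model, runtime)
        if s is not None and (best_score is None or s > best_score):
            best, best_score = runtime, s
    return best


# Illustrative data only: the scores say nothing about real runtime quality.
EXAMPLE_RUNTIMES = [
    Runtime("vllm", {"LlamaForCausalLM": 8, "MistralForCausalLM": 8},
            {"safetensors", "pytorch"}, 70_000_000_000),
    Runtime("sglang", {"LlamaForCausalLM": 9},
            {"safetensors"}, 70_000_000_000),
    Runtime("tensorrt-llm", {"LlamaForCausalLM": 10},
            {"tensorrt"}, 70_000_000_000),
]

if __name__ == "__main__":
    model = ModelInfo("LlamaForCausalLM", "safetensors", 8_000_000_000)
    chosen = select_runtime(model, EXAMPLE_RUNTIMES)
    print(chosen.name if chosen else "no compatible runtime")
```

In this sketch the highest architecture score only wins if the runtime also passes the format and size filters, so a runtime that scores well for an architecture is still skipped when the model is stored in a format it cannot load.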
