SeldonIO/MLServer
An open-source inference server for machine learning models, providing REST and gRPC APIs compliant with the KFServing (now KServe) V2 inference protocol.

MLServer is a Python-based inference server that serves ML models over REST and gRPC interfaces. It supports multi-model serving, parallel inference, and adaptive batching to improve throughput. The server is designed to run in Kubernetes environments and integrates with Seldon Core and KServe, which can use it as their underlying model-serving runtime. Its APIs follow the standardized V2 Inference Protocol, which is shared across multiple ML serving frameworks.
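
To make the V2 Inference Protocol mentioned above more concrete, here is a minimal sketch of what a REST inference request can look like. It only builds the URL and JSON body and does not contact a server. The helper names (build_infer_url, build_infer_request), the model name "my-model", the input name "input-0", and the base URL http://localhost:8080 are all assumptions made for illustration and are not taken from this description; the endpoint path and the name/shape/datatype/data fields follow the V2 protocol's REST format.

```python
import json
from typing import Any, Dict, List, Optional, Sequence


def build_infer_url(base_url: str, model_name: str, version: Optional[str] = None) -> str:
    """Build the V2 REST inference endpoint for a model (hypothetical helper)."""
    base = base_url.rstrip("/")
    if version is None:
        return f"{base}/v2/models/{model_name}/infer"
    return f"{base}/v2/models/{model_name}/versions/{version}/infer"


def build_infer_request(
    input_name: str,
    rows: Sequence[Sequence[float]],
    datatype: str = "FP32",
) -> Dict[str, Any]:
    """Wrap a 2-D batch of rows as a V2 inference request body (hypothetical helper)."""
    if not rows:
        raise ValueError("rows must contain at least one row")
    n_cols = len(rows[0])
    if any(len(row) != n_cols for row in rows):
        raise ValueError("all rows must have the same length")
    # Tensor data is sent flattened in row-major order, with the shape given separately.
    flat: List[float] = [value for row in rows for value in row]
    return {
        "inputs": [
            {
                "name": input_name,
                "shape": [len(rows), n_cols],
                "datatype": datatype,
                "data": flat,
            }
        ]
    }


# Assumed values for illustration only: adjust to your own deployment.
url = build_infer_url("http://localhost:8080", "my-model")
payload = build_infer_request("input-0", [[1.0, 2.0], [3.0, 4.0]])
body = json.dumps(payload)
```

The resulting body can be sent as an HTTP POST to url with any HTTP client. Input names, datatypes, and shapes depend on the model being served, so the values here are placeholders.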