
bentoml/BentoML

A Python library for building and deploying model inference APIs and multi-model serving systems.

Velocity (7d): +3.3 ★ / day
Trend: steady

BentoML provides tools to turn any AI/ML model inference script into a REST API server with minimal code. It handles dependency management, Docker image generation, and deployment reproducibility through simple config files. The framework includes built-in serving optimizations such as dynamic batching and model parallelism to maximize CPU and GPU utilization for high-performance inference workloads.
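
The dynamic batching mentioned above can be illustrated with a small standard-library sketch. This is not BentoML's implementation or API. The names `Request`, `form_batches`, `fake_model`, and `serve`, and the parameters `max_batch_size` and `max_wait_ms`, are hypothetical and exist only for illustration. The idea shown is that requests arriving close together are grouped so the model runs once per batch instead of once per request.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    arrival_ms: float  # hypothetical arrival timestamp in milliseconds
    payload: str


def form_batches(
    requests: List[Request], max_batch_size: int, max_wait_ms: float
) -> List[List[Request]]:
    """Group requests into batches in arrival order.

    A batch is closed when it is full, or when the next request arrived
    more than max_wait_ms after the first request in the batch.
    """
    batches: List[List[Request]] = []
    current: List[Request] = []
    for req in sorted(requests, key=lambda r: r.arrival_ms):
        full = len(current) >= max_batch_size
        too_late = bool(current) and (
            req.arrival_ms - current[0].arrival_ms > max_wait_ms
        )
        if current and (full or too_late):
            batches.append(current)
            current = []
        current.append(req)
    if current:
        batches.append(current)
    return batches


def fake_model(payloads: List[str]) -> List[str]:
    # Stand-in for a real model: one call processes the whole batch.
    return [p.upper() for p in payloads]


def serve(
    requests: List[Request], max_batch_size: int = 4, max_wait_ms: float = 10.0
) -> List[str]:
    # Run the model once per batch and return results in arrival order.
    results: List[str] = []
    for batch in form_batches(requests, max_batch_size, max_wait_ms):
        results.extend(fake_model([r.payload for r in batch]))
    return results
```

A real serving framework makes this decision online, as requests arrive, and typically tunes the batch size and wait window against a latency target. The sketch works on a fixed list of timestamps only to keep the grouping rule easy to follow.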
