
triton-inference-server/server

NVIDIA's open-source inference server for optimized deep learning model serving on GPU, cloud, and edge.

10.7k stars · Python · Inference · Serving
Velocity (7d): +3.8 ★/day
Trend: steady

Triton Inference Server is a production inference serving platform that optimizes model deployment across GPUs, CPUs, and edge devices. It supports multiple deep learning frameworks and backends, enabling low-latency inference for machine learning models in datacenter and edge environments.
