Lightning-AI/LitServe
LitServe is a Python framework for deploying custom AI inference servers with control over batching, streaming, and scaling.

Velocity (7d): +4.3 ★/day · Trend: steady
LitServe is a minimal Python framework for building custom AI inference servers without MLOps glue code or configuration files. It provides full control over inference logic, batching, routing, and streaming for models, agents, RAG systems, and pipelines. The framework supports any PyTorch model, integrates with vLLM, and offers multi-GPU autoscaling with serverless deployment options.
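To make "full control over inference logic" and batching concrete, here is a minimal sketch of the request lifecycle in plain Python. It borrows the hook names LitServe documents for its `LitAPI` class (`setup`, `decode_request`, `predict`, `encode_response`), but it deliberately does not import LitServe: `SquareAPI` and the `serve_batch` driver are hypothetical stand-ins for illustration, and the "model" is a toy squaring function. In an actual deployment the class would presumably subclass `litserve.LitAPI` and be handed to `litserve.LitServer`, which takes over the batching and HTTP handling.

```python
# Illustrative sketch only: plain Python, no LitServe import.
# SquareAPI and serve_batch are hypothetical names, not part of LitServe.
from typing import Any


class SquareAPI:
    """Toy API that mirrors the hook names documented for LitAPI."""

    def setup(self, device: str) -> None:
        # Load the model once per worker; here a stand-in function.
        self.device = device
        self.model = lambda xs: [x * x for x in xs]

    def decode_request(self, request: dict[str, Any]) -> float:
        # Pull the model input out of the JSON-like payload.
        return float(request["input"])

    def predict(self, batch: list[float]) -> list[float]:
        # Run the model once on the whole batch.
        return self.model(batch)

    def encode_response(self, output: float) -> dict[str, float]:
        # Wrap a single model output as a response payload.
        return {"output": output}


def serve_batch(api: SquareAPI, requests: list[dict[str, Any]]) -> list[dict[str, float]]:
    """Hypothetical driver: decode each request, predict once, encode each result."""
    inputs = [api.decode_request(r) for r in requests]
    outputs = api.predict(inputs)
    return [api.encode_response(o) for o in outputs]


api = SquareAPI()
api.setup(device="cpu")
responses = serve_batch(api, [{"input": 2}, {"input": 3}])
print(responses)  # [{'output': 4.0}, {'output': 9.0}]
```

Keeping decoding, prediction, and encoding as separate hooks is what lets a server group several decoded requests into one `predict` call without the model code knowing about HTTP.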