
Lightning-AI/LitServe

LitServe is a Python framework for deploying custom AI inference servers with control over batching, streaming, and scaling.

3.9k stars · Python · Inference · Serving
Velocity (7d): +4.3 ★ / day · Trend: steady

LitServe is a minimal Python framework for building custom AI inference servers without MLOps glue code or configuration files. It provides full control over inference logic, batching, routing, and streaming for models, agents, RAG systems, and pipelines. The framework supports any PyTorch model, integrates with vLLM, and offers multi-GPU autoscaling with serverless deployment options.
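As a rough illustration of what "full control over inference logic" can look like, the sketch below subclasses LitServe's `LitAPI` and fills in its `setup`, `decode_request`, `predict`, and `encode_response` hooks. The class name `EchoSquareAPI`, the toy squaring "model", the `"input"`/`"output"` JSON keys, and the port number are all assumptions made for this example and do not come from the listing. The import fallback exists only so the hooks can be exercised without LitServe installed.

```python
# Minimal sketch of a LitServe-style API. The model is a stand-in lambda,
# not a real PyTorch model; names below are hypothetical.
try:
    import litserve as ls
    BaseAPI = ls.LitAPI
except ImportError:  # fallback so the hooks can still be run in isolation
    ls = None
    BaseAPI = object


class EchoSquareAPI(BaseAPI):
    def setup(self, device):
        # Called once per worker; load the model here.
        self.model = lambda x: x ** 2

    def decode_request(self, request):
        # Turn the incoming JSON payload into model input.
        return float(request["input"])

    def predict(self, x):
        # Run inference on the decoded input.
        return self.model(x)

    def encode_response(self, output):
        # Turn the model output into a JSON-serializable response.
        return {"output": output}


def main():
    # Starts the server; requires `pip install litserve`.
    if ls is None:
        raise RuntimeError("litserve is not installed")
    server = ls.LitServer(EchoSquareAPI(), accelerator="auto")
    server.run(port=8000)
```

Calling `main()` starts the server. Under the assumptions above, a JSON body such as `{"input": 3}` sent to the server's prediction endpoint would pass through the four hooks in order.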
