scaleapi/llm-engine
An open-source Python library and Helm chart for fine-tuning and serving foundation models including LLaMA, MPT, and Falcon.

Velocity (7d): +0.8 ★/day · Trend: steady
LLM Engine provides APIs and infrastructure for deploying and serving open-source foundation models, supporting both Scale’s hosted platform and self-hosted Kubernetes deployments. It offers fine-tuning capabilities on custom data, streaming inference, and dynamic batching for optimized throughput and latency. The library integrates with Hugging Face models and provides CLI tooling for model management.
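
To make the dynamic batching idea concrete, here is a minimal, generic sketch in plain Python. It is not LLM Engine's implementation: the `Request` type, the `form_batches` function, and the `max_batch_size` and `max_wait_s` parameters are all hypothetical names chosen for illustration. The sketch assumes the common policy of closing a batch when it is full, or when a new request arrives after the oldest queued request has waited longer than a fixed budget. It simulates arrival times rather than using real timers, so the result is deterministic.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    # Hypothetical request record; only the fields the sketch needs.
    request_id: str
    arrival_s: float  # arrival time in seconds


def form_batches(
    requests: List[Request], max_batch_size: int, max_wait_s: float
) -> List[List[Request]]:
    """Group requests into batches under a size cap and a wait budget."""
    batches: List[List[Request]] = []
    current: List[Request] = []
    for req in sorted(requests, key=lambda r: r.arrival_s):
        # Close the open batch if its oldest request exceeded the wait
        # budget before this request arrived.
        if current and req.arrival_s - current[0].arrival_s > max_wait_s:
            batches.append(current)
            current = []
        current.append(req)
        # Close the batch as soon as it reaches the size cap.
        if len(current) == max_batch_size:
            batches.append(current)
            current = []
    if current:
        # Flush whatever is left once the input is exhausted.
        batches.append(current)
    return batches


arrivals = [
    Request("a", 0.00),
    Request("b", 0.01),
    Request("c", 0.02),
    Request("d", 0.50),
    Request("e", 0.51),
]
batches = form_batches(arrivals, max_batch_size=2, max_wait_s=0.05)
print([[r.request_id for r in batch] for batch in batches])
# prints [['a', 'b'], ['c'], ['d', 'e']]
```

The two knobs trade off against each other: a larger `max_batch_size` tends to raise throughput per model invocation, while a smaller `max_wait_s` bounds how long any single request sits in the queue.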