deepspeedai/DeepSpeed-MII
DeepSpeed-MII is a library for low-latency, high-throughput inference serving of deep learning models including LLMs and diffusion models.

Velocity (7d): +1.4 ★/day · Trend: steady
DeepSpeed-MII provides optimized inference built on the DeepSpeed framework for serving large language models and other deep learning models efficiently. Supported models include Mixtral, Phi-2, Falcon, and Stable Diffusion. The library targets production inference workloads, using DeepSpeed's inference optimizations to reduce latency and increase throughput.