
deepspeedai/DeepSpeed-MII

DeepSpeed-MII is a library for low-latency, high-throughput inference serving of deep learning models including LLMs and diffusion models.

Velocity (7d): +1.4 ★ / day
Trend: steady

DeepSpeed-MII provides optimized inference powered by the DeepSpeed framework, enabling efficient serving of large language models and other deep learning models. Supported models include Mixtral, Phi-2, Falcon, and Stable Diffusion. The library uses DeepSpeed's inference optimizations to reduce latency and increase throughput for production inference workloads.
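
As a rough illustration of what serving a model through the library can look like, here is a minimal sketch built on the `mii.pipeline` entry point described in the project's README. The wrapper function `generate`, the constant `DEFAULT_MODEL`, and the choice of model are assumptions made for this example, not part of the library. Running it requires the `deepspeed-mii` package and a supported GPU.

```python
# Minimal sketch: batch text generation with a non-persistent MII pipeline.
# `generate` and `DEFAULT_MODEL` are hypothetical names used for illustration.

DEFAULT_MODEL = "mistralai/Mistral-7B-v0.1"  # assumed example model


def generate(prompts, model_name=DEFAULT_MODEL, max_new_tokens=128):
    """Run a batch of prompts through an MII pipeline and return its responses."""
    # Imported lazily so this module can be loaded on machines without MII.
    import mii

    # Loads the model and applies DeepSpeed inference optimizations.
    pipe = mii.pipeline(model_name)

    # The pipeline takes a list of prompts and generates them as one batch.
    return pipe(prompts, max_new_tokens=max_new_tokens)
```

On a machine with a suitable GPU, calling `generate(["DeepSpeed is", "Seattle is"])` would load the model and return one response per prompt. Building the pipeline on every call keeps the sketch short; a long-running service would more likely create it once and reuse it.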
