chengzeyi/stable-fast
A high-performance inference optimization framework for diffusion models using PyTorch, CUDA, and OpenAI Triton on NVIDIA GPUs.

Velocity (7d): +1.4 ★/day · Trend: steady
Stable Fast is an inference acceleration framework that achieves state-of-the-art inference performance across Hugging Face Diffusers pipelines, including the latest Stable Video Diffusion. Unlike TensorRT or AITemplate, which require lengthy compilation, it compiles a model in seconds. It natively supports dynamic shapes, LoRA adapters, and ControlNet, and it targets NVIDIA GPUs.