siliconflow/onediff
An acceleration library that runs diffusion models faster on GPUs through optimized CUDA kernels and graph compilation.

OneDiff provides out-of-the-box performance optimization for diffusion models, including Stable Diffusion, SDXL, SDXL-Turbo, and Stable Video Diffusion. It integrates with inference frameworks such as diffusers, ComfyUI, and SD-WebUI, and uses optimized CUDA kernels and compilation passes to reduce latency and increase throughput. The library also supports LCM (Latent Consistency Models) and LoRA adapters for fine-tuned models.