
vipshop/cache-dit

A one-line turbocharger for Diffusion Transformers

It wraps HuggingFace Diffusers to add caching, parallelism, and quantization to DiT pipelines without rewriting your inference code.

★ 1.2k stars · Python · Inference · Serving · Image · Video · Audio

What it does

Cache-DiT is an inference acceleration layer that sits on top of the HuggingFace Diffusers library. It intercepts standard Diffusion Transformer pipelines to apply hybrid cache strategies such as DBCache and TaylorSeer, along with tensor and context parallelism, CPU offloading, and quantization. The goal is to make existing DiT models run faster and use less memory on NVIDIA, AMD, and Ascend hardware without replacing the underlying model code.
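To make the caching idea concrete, here is a toy sketch in plain Python of threshold-based step caching: run a cheap "probe" block at every step, and if its output barely changed since the last full pass, reuse the cached contribution of the remaining blocks. This is not cache-dit's DBCache or TaylorSeer implementation. The class `ToyStepCache`, the `rel_l1` metric, and the threshold value are all hypothetical names and choices made for illustration.

```python
from typing import Callable, List, Optional

Vector = List[float]
Block = Callable[[Vector], Vector]


def rel_l1(a: Vector, b: Vector) -> float:
    """Relative L1 distance between two equally sized vectors."""
    denom = sum(abs(x) for x in b) or 1.0
    return sum(abs(x - y) for x, y in zip(a, b)) / denom


class ToyStepCache:
    """Hypothetical illustration of step-to-step caching, not cache-dit's code."""

    def __init__(self, blocks: List[Block], threshold: float) -> None:
        self.blocks = blocks
        self.threshold = threshold
        self.prev_probe: Optional[Vector] = None
        self.cached_residual: Optional[Vector] = None
        self.block_calls = 0  # how many blocks were actually executed

    def __call__(self, x: Vector) -> Vector:
        # The first block always runs and acts as a cheap change detector.
        probe = self.blocks[0](x)
        self.block_calls += 1

        if (
            self.prev_probe is not None
            and self.cached_residual is not None
            and rel_l1(probe, self.prev_probe) < self.threshold
        ):
            # Cache hit: skip the remaining blocks and reuse their cached effect.
            return [p + r for p, r in zip(probe, self.cached_residual)]

        # Cache miss: run the remaining blocks and refresh the cache.
        hidden = probe
        for block in self.blocks[1:]:
            hidden = block(hidden)
            self.block_calls += 1
        self.prev_probe = probe
        self.cached_residual = [h - p for h, p in zip(hidden, probe)]
        return hidden


if __name__ == "__main__":
    blocks: List[Block] = [
        lambda v: [2.0 * x for x in v],
        lambda v: [x + 1.0 for x in v],
        lambda v: [0.5 * x for x in v],
    ]
    model = ToyStepCache(blocks, threshold=0.05)
    for step_input in ([1.0, 2.0], [1.001, 2.001], [3.0, 4.0]):
        model(step_input)
    print(model.block_calls)  # 7 block calls instead of 9
```

The second step is a cache hit, so its output is an approximation of the full computation. That is the trade-off a threshold like this controls: a larger threshold skips more work but lets more error accumulate.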

The interesting bit

The hook is ergonomic: a single call like cache_dit.enable_cache(pipe) can switch on optimizations that normally require manual graph surgery. Rather than building a siloed framework, it has become a shared substrate: tools such as ComfyUI, SGLang, and vLLM-Omni have integrated it, which suggests the community is starting to treat it as a common acceleration layer for DiTs.
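As a minimal sketch of what that one-line call might look like in practice, assuming the `diffusers` and `cache_dit` packages are installed: only `cache_dit.enable_cache(pipe)` comes from the text above. The helper name `build_accelerated_pipeline` and the `model_id` argument are hypothetical, and the default cache settings that the call applies are not described here.

```python
def build_accelerated_pipeline(model_id: str):
    """Load a Diffusers pipeline and switch on cache-dit with its defaults.

    `model_id` is a placeholder supplied by the caller, for example the
    Hub id of a DiT-based model you already use.
    """
    # Imports are kept inside the function so that merely defining this
    # helper does not require the heavy dependencies to be installed.
    from diffusers import DiffusionPipeline
    import cache_dit

    pipe = DiffusionPipeline.from_pretrained(model_id)
    # The single call named in the text. Which strategies it enables by
    # default is an assumption to check against the project's README.
    cache_dit.enable_cache(pipe)
    return pipe
```

The rest of the inference code, such as calling `pipe(prompt)`, would stay as it was before the cache was enabled.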

Key highlights

Claims up to 9× speedup when combining cache, context parallelism, and compilation.
Supports hybrid 2D/3D parallelism and dedicated parallel strategies for Text Encoders, VAEs, and ControlNets.
Offers experimental W4A4 post-training quantization via SVDQuant with configurable rank.
Runs natively on NVIDIA GPUs, AMD GPUs, and Ascend NPUs.
Adopted by downstream projects including ComfyUI, TensorRT-LLM, stable-diffusion.cpp, and Nunchaku.

Caveats

The SVDQuant workflow is explicitly marked experimental and requires building from source with a special environment flag.
The README does not detail how the claimed 9× speedup was measured or on which specific model and hardware.
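Because the measurement setup behind the headline figure is not given, it is worth timing your own pipeline before and after enabling the optimizations. The helper below is a generic sketch in standard-library Python. The names `median_latency` and `speedup` are hypothetical and are not part of cache-dit.

```python
import statistics
import time
from typing import Callable


def median_latency(fn: Callable[[], object], warmup: int = 1, repeats: int = 5) -> float:
    """Return the median wall-clock time in seconds of calling `fn`."""
    # Warm-up runs absorb one-off costs such as compilation or allocation.
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        # On a GPU, `fn` should wait for the device to finish (for example
        # by calling torch.cuda.synchronize()) or the timing will be too low.
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(baseline_s: float, optimized_s: float) -> float:
    """Ratio of baseline latency to optimized latency."""
    if optimized_s <= 0:
        raise ValueError("optimized latency must be positive")
    return baseline_s / optimized_s
```

Run `median_latency` once on the unmodified pipeline and once on the accelerated one, using the same model, prompt, step count, and hardware, then pass both results to `speedup`. Compare output quality as well, since caching and quantization are approximations.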

Verdict

If you are already using Diffusers and need to push DiT inference faster without abandoning your pipeline code, this is worth evaluating. If you are not working with Diffusion Transformers or need training-time optimizations, this adds nothing for you.

Frequently asked

What is vipshop/cache-dit? It wraps HuggingFace Diffusers to add caching, parallelism, and quantization to DiT pipelines without rewriting your inference code.
Is cache-dit open source? Yes, vipshop/cache-dit is open source, released under the Apache-2.0 license.
What language is cache-dit written in? vipshop/cache-dit is primarily written in Python.
How popular is cache-dit? vipshop/cache-dit has 1.2k stars on GitHub.
Where can I find cache-dit? vipshop/cache-dit is on GitHub at https://github.com/vipshop/cache-dit.