tianweiy/CausVid

Turning video diffusion into a real-time stream

CausVid exists because standard video diffusion wastes time processing future frames before it can output the first one.

★ 1.4k stars · Python · Category: Image, Video, Audio

What it does

CausVid converts a pretrained bidirectional diffusion transformer, built on the Wan2.1 suite, into a fully causal, autoregressive generator. It replaces the usual global attention with a frame-by-frame causal mask, so each frame attends only to itself and to earlier frames, then distills the original 50-step diffusion process down to four steps using a video-aware extension of Distribution Matching Distillation (DMD). The system streams video at 9.4 FPS on a single GPU using KV caching, and it can synthesize long clips even though training was limited to short segments.
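
To make the frame-by-frame causal mask concrete, here is a minimal sketch in plain Python. It is not taken from the CausVid repository: the function name `frame_causal_mask` and the assumption that every frame contributes the same number of tokens are illustrative only. Tokens inside one frame may attend to each other, while tokens of later frames stay hidden.

```python
def frame_causal_mask(num_frames: int, tokens_per_frame: int) -> list[list[bool]]:
    """Return mask[q][k] == True when query token q may attend to key token k.

    Hypothetical helper: tokens are assumed to be laid out frame after frame,
    with the same number of tokens in every frame.
    """
    total = num_frames * tokens_per_frame
    mask = []
    for q in range(total):
        q_frame = q // tokens_per_frame
        # A key is visible if it belongs to the same frame or an earlier one.
        row = [(k // tokens_per_frame) <= q_frame for k in range(total)]
        mask.append(row)
    return mask


if __name__ == "__main__":
    for row in frame_causal_mask(num_frames=3, tokens_per_frame=2):
        print("".join("x" if allowed else "." for allowed in row))
    # xx....
    # xx....
    # xxxx..
    # xxxx..
    # xxxxxx
    # xxxxxx
```

Because no row ever looks at a later frame, the keys and values computed for finished frames never change, which is what makes KV caching possible.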

The interesting bit

The authors avoid the usual autoregressive quality collapse by having a causal student learn from a bidirectional teacher that retains full temporal context, and they warm-start the student along the teacher's own ODE trajectories. This asymmetric distillation keeps the model from drifting as it generates frame after frame without peeking ahead.
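
One way to picture the ODE warm start is the toy sketch below. It is an illustration under loud assumptions, not the authors' training code: the "video" is a single Gaussian scalar, the teacher is the closed-form optimal denoiser for that Gaussian, the noise model is x_t = x_0 + t * noise, and the student timestep indices `(0, 12, 25, 37)` are made up. The teacher's deterministic sampler is solved with 50 Euler steps, and the states at four chosen timesteps are paired with the trajectory endpoint as regression targets for a student.

```python
import random

DATA_MEAN, DATA_STD = 2.0, 0.5  # toy data distribution (assumption)


def teacher_denoise(x_t: float, t: float) -> float:
    """Optimal denoiser for Gaussian data when x_t = x_0 + t * noise."""
    shrink = DATA_STD**2 / (DATA_STD**2 + t**2)
    return DATA_MEAN + shrink * (x_t - DATA_MEAN)


def teacher_ode_trajectory(x_start: float, t_max: float = 10.0, num_steps: int = 50):
    """Euler solve of dx/dt = (x - D(x, t)) / t from t_max down to 0."""
    times = [t_max * (1 - i / num_steps) for i in range(num_steps + 1)]
    states = [x_start]
    for t, t_next in zip(times[:-1], times[1:]):
        x = states[-1]
        x0_pred = teacher_denoise(x, t)
        # Euler step, rewritten as an interpolation toward the prediction.
        states.append(x0_pred + (t_next / t) * (x - x0_pred))
    return times, states


def regression_pairs(times, states, student_indices=(0, 12, 25, 37)):
    """Pair the state at each (hypothetical) student timestep with the endpoint."""
    x0 = states[-1]
    return [(states[i], times[i], x0) for i in student_indices]


def ode_regression_loss(student, pairs) -> float:
    """Mean squared error between the student's guess and the teacher endpoint."""
    return sum((student(x_t, t) - x0) ** 2 for x_t, t, x0 in pairs) / len(pairs)


if __name__ == "__main__":
    rng = random.Random(0)
    times, states = teacher_ode_trajectory(x_start=rng.gauss(0.0, 10.0))
    pairs = regression_pairs(times, states)
    # Loss of an untrained student that returns its input unchanged.
    print(ode_regression_loss(lambda x_t, t: x_t, pairs))
```

In a real setup the student would be a causal network and the loss would be minimized by gradient descent before the distribution-matching stage; the sketch only shows how teacher trajectories could supply the (input, target) pairs.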

Key highlights

Runs autoregressively with causal attention, enabling true streaming generation and KV caching.
Distills 50-step diffusion into a 4-step generator via video DMD.
Reports a VBench-Long score of 84.27, surpassing prior models according to the authors.
Supports zero-shot streaming video-to-video translation, image-to-video, and dynamic prompting.
Built on the Wan2.1 backbone.
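
The streaming behaviour described in the highlights can be sketched as a generator loop. This is a toy under stated assumptions, not the repository's inference code: `toy_denoise` is a hypothetical stand-in for the causal network, a frame is a short list of floats, and `kv_cache` is a plain list standing in for per-layer key/value tensors. What it does show is the control flow: each frame is refined in a few steps against the cached context, appended to the cache once, and yielded immediately.

```python
import random
from typing import Callable, Iterator

Frame = list[float]


def toy_denoise(noisy: Frame, context: list[Frame], step: int, num_steps: int) -> Frame:
    """Hypothetical denoiser: moves the frame halfway toward the mean of the context."""
    if context:
        target = [sum(col) / len(context) for col in zip(*context)]
    else:
        target = [0.0] * len(noisy)
    return [0.5 * n + 0.5 * t for n, t in zip(noisy, target)]


def stream_video(
    num_frames: int,
    frame_size: int,
    num_steps: int = 4,
    seed: int = 0,
    denoise: Callable[[Frame, list[Frame], int, int], Frame] = toy_denoise,
) -> Iterator[Frame]:
    """Yield frames one at a time; past frames are read from the cache, never recomputed."""
    rng = random.Random(seed)
    kv_cache: list[Frame] = []  # stand-in for cached keys/values of finished frames
    for _ in range(num_frames):
        frame = [rng.gauss(0.0, 1.0) for _ in range(frame_size)]  # start from noise
        for step in range(num_steps):
            frame = denoise(frame, kv_cache, step, num_steps)
        kv_cache.append(frame)  # written once, reused by every later frame
        yield frame  # the caller can display this before the next frame exists


if __name__ == "__main__":
    for index, frame in enumerate(stream_video(num_frames=3, frame_size=2)):
        print(index, [round(value, 3) for value in frame])
```

The first frame becomes available after only `num_steps` denoiser calls, which is the property that distinguishes this loop from a bidirectional sampler that must process the whole clip before showing anything.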

Caveats

The repo is flagged as a work in progress with frequent updates expected.
Distillation was trained on a small toy dataset (MixKit, roughly 6K videos); the authors note larger, higher-quality data is likely needed for best results.
Image-to-video checkpoints and cross-attention feature caching are still on the TODO list.

Verdict

A solid starting point for researchers who want streaming video diffusion with a CVPR 2025 pedigree. Not yet the repo to grab if you need a stable, turnkey production pipeline.

Frequently asked

What is tianweiy/CausVid?: CausVid exists because standard video diffusion wastes time processing future frames before it can output the first one.
Is CausVid open source?: Yes — tianweiy/CausVid is an open-source project tracked on heatdrop.
What language is CausVid written in?: tianweiy/CausVid is primarily written in Python.
How popular is CausVid?: tianweiy/CausVid has 1.4k stars on GitHub.
Where can I find CausVid?: tianweiy/CausVid is on GitHub at https://github.com/tianweiy/CausVid.