kwsong0113/diffusion-forcing-transformer
A video diffusion model that generates videos conditioned on arbitrary context frames using a novel Diffusion Forcing Transformer architecture.

Velocity (7d): +1.4 ★/day · Trend: steady
Implements History-Guided Video Diffusion. It introduces the Diffusion Forcing Transformer (DFoT), which conditions video generation on multiple context frames, together with History Guidance (HG) methods that improve video quality, temporal consistency, and motion dynamics while enabling new capabilities such as compositional video generation and stable long video rollout. Published at ICML 2025.
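To make the idea of guiding generation with history frames concrete, here is a minimal sketch that assumes guidance takes the familiar classifier-free-guidance form: a prediction made with the context frames is extrapolated away from a prediction made without them. This form is an assumption for illustration; the function name `guided_prediction` and its parameters are hypothetical and are not taken from the repository.

```python
def guided_prediction(with_history, without_history, scale):
    """Combine two model predictions in a classifier-free-guidance style.

    ASSUMPTION: this formula is the generic guidance rule used in many
    diffusion models, not code or an API from the DFoT repository.

    with_history:    prediction conditioned on the context frames
    without_history: prediction made with the context frames dropped
    scale:           guidance weight (1.0 reproduces the conditioned prediction)
    """
    if len(with_history) != len(without_history):
        raise ValueError("predictions must have the same length")
    # Start from the unconditioned prediction and move along the
    # direction that the history conditioning adds.
    return [u + scale * (c - u) for c, u in zip(with_history, without_history)]


if __name__ == "__main__":
    conditioned = [1.0, 2.0]
    unconditioned = [0.0, 0.0]
    for w in (0.0, 1.0, 2.0):
        print(w, guided_prediction(conditioned, unconditioned, w))
```

Under this assumed rule, a scale of 0 ignores the history entirely, a scale of 1 returns the history-conditioned prediction unchanged, and larger scales push the output further in the direction the history suggests.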