vita-epfl/Stable-Video-Infinity
Research project presenting a video diffusion transformer that generates extended videos via iterative error recycling across overlapping windows.

This repository implements Stable Video Infinity, a video generation method that uses diffusion transformers to produce arbitrarily long videos. The approach feeds overlapping generated segments back into the denoising process so that accumulated errors are corrected rather than compounded, which keeps generation coherent across many segments. It supports applications including audio-driven talking faces and dance generation, and model weights are available on Hugging Face.
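
As a rough illustration of the overlapping-window idea only, the sketch below computes frame ranges for consecutive segments, where each segment shares its first few frames with the tail of the previous one. It is not taken from the repository: the function name `window_schedule` and its parameters are hypothetical, and the error-recycling step itself is not shown.

```python
from typing import List, Tuple


def window_schedule(total_frames: int, window: int, overlap: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) frame ranges that cover total_frames.

    Hypothetical helper, not the project's API: consecutive ranges share
    `overlap` frames, so a new segment could be conditioned on the tail
    of the previous one.
    """
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    if total_frames <= 0:
        return []

    stride = window - overlap  # new frames contributed by each segment
    ranges: List[Tuple[int, int]] = []
    start = 0
    while True:
        # Clip the last window so it never runs past the requested length.
        end = min(start + window, total_frames)
        ranges.append((start, end))
        if end == total_frames:
            break
        start += stride
    return ranges
```

For example, `window_schedule(10, 4, 1)` yields three ranges, (0, 4), (3, 7), and (6, 10), so frames 3 and 6 each appear in two adjacent segments. A larger overlap gives each segment more context from its predecessor at the cost of more segments per video.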