thu-ml/DiT-Extrapolation
Official implementation of RIFLEx and UltraViCo, research methods for length extrapolation in video and image diffusion transformers.

This repository provides plug-and-play implementations of research methods that extend diffusion transformer models to generate longer videos and higher-resolution images. RIFLEx (ICML 2025) enables length extrapolation in video diffusion transformers; UltraViCo (ICLR 2026) extends the approach to video synthesis across multiple base models; and UltraImage applies similar techniques to high-resolution image generation. The code integrates with popular diffusion frameworks and models, including HuggingFace diffusers, CogVideoX, HunyuanVideo, Wan2.1, and Flux.