Soul-AILab/SoulX-LiveAct
A diffusion-based framework for real-time, hour-scale human animation video generation on GPUs.

SoulX-LiveAct is a diffusion-model framework for generating lifelike human animation videos in real time. It introduces Neighbor Forcing, a technique that uses diffusion-step-aligned neighbor latents as an inductive bias for consistent autoregressive video generation. The framework also includes ConvKV Memory, a plug-in compression mechanism that keeps memory use constant during hour-scale video generation while adding minimal overhead. With FP8 precision and operator fusion, the system reaches 20 FPS inference on dual H100/H200 GPUs at 720x416 or 512x512 resolution.
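
To make the constant-memory claim concrete, here is a minimal, stdlib-only sketch of a bounded context buffer. It is not the ConvKV Memory implementation, whose internals are not described here. The class `BoundedMemory`, the helper `strided_mean_pool`, all parameter names, the fixed averaging kernel, and the policy of dropping the oldest compressed entries are assumptions chosen purely for illustration. The only figure taken from the description above is the 20 FPS rate, which implies 72,000 frames per hour of video.

```python
from collections import deque


def strided_mean_pool(vectors, stride):
    """Average each consecutive group of `stride` equal-length vectors.

    Behaves like a 1D convolution with a fixed uniform kernel and a step
    equal to the kernel width (an illustrative stand-in, not a learned layer).
    """
    pooled = []
    for start in range(0, len(vectors) - stride + 1, stride):
        group = vectors[start:start + stride]
        pooled.append([sum(column) / stride for column in zip(*group)])
    return pooled


class BoundedMemory:
    """Hypothetical context buffer whose size does not grow with video length."""

    def __init__(self, recent_size=4, stride=2, compressed_size=4):
        self.recent_size = recent_size
        self.stride = stride
        self.recent = deque()  # newest entries, kept uncompressed
        # Assumption: once full, the oldest compressed entries are discarded.
        self.compressed = deque(maxlen=compressed_size)

    def append(self, vector):
        self.recent.append(vector)
        # When the recent window overflows by one full stride,
        # pool its oldest `stride` entries into a single compressed entry.
        if len(self.recent) >= self.recent_size + self.stride:
            oldest = [self.recent.popleft() for _ in range(self.stride)]
            self.compressed.extend(strided_mean_pool(oldest, self.stride))

    def __len__(self):
        return len(self.recent) + len(self.compressed)


# One hour of video at the stated 20 FPS.
frames_per_hour = 20 * 60 * 60

memory = BoundedMemory()
upper_bound = memory.recent_size + memory.stride - 1 + memory.compressed.maxlen
for frame_index in range(frames_per_hour):
    # Toy 2-dimensional "latent" per frame; real latents would be tensors.
    memory.append([float(frame_index), 1.0])

print(frames_per_hour, len(memory), upper_bound)
```

In this sketch the buffer length stays under a fixed bound set by the three constructor arguments, regardless of how many frames are appended. That is the property the "constant-memory" description refers to, although the real mechanism may compress and retain history quite differently.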