Francis-Rings/StableAvatar

A video diffusion model that generates infinite-length avatar videos from a reference image and audio input.

1.2k stars · Python · Image · Video · Audio
Velocity (7d): +4.1 ★/day · Trend: steady

StableAvatar is an end-to-end video diffusion transformer for synthesizing high-quality, infinite-length avatar videos driven by audio. It takes a reference image and audio as conditioning inputs and generates synchronized talking-head videos without post-processing. The diffusion transformer architecture handles both visual generation and temporal consistency across long video sequences.
