
jdh-algo/JoyVASA

A diffusion-based method for generating talking portrait and animal videos from audio, producing facial dynamics and head motion.

871 stars Python Image · Video · Audio
Star velocity (7d): +1.5 ★ / day · Trend: steady

JoyVASA is a diffusion-based approach to audio-driven facial animation that generates realistic talking-head video. It uses a decoupled facial representation framework with a two-stage pipeline: the first stage extracts disentangled facial representations, and the second generates facial dynamics and head motion from the audio. The method supports both human portraits and animal images, producing natural lip-sync and head movements.
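
The two-stage data flow described above can be illustrated with a toy sketch. This is not JoyVASA's code or API: every name here (`FacialRepresentation`, `extract_representation`, `generate_motion`, `animate`) is hypothetical, the inputs are plain lists of floats, and the deterministic refinement loop is only a stand-in for a real diffusion sampler, which would start from noise and use a learned denoiser.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FacialRepresentation:
    """Hypothetical container for a decoupled face representation."""
    appearance: List[float]  # static part (who the subject is)
    motion: List[float]      # dynamic part (expression and head pose)


def extract_representation(image: List[float]) -> FacialRepresentation:
    """Stage 1 stand-in: split an image into a static and a dynamic part."""
    mean = sum(image) / len(image)
    return FacialRepresentation(appearance=[mean],
                                motion=[p - mean for p in image])


def generate_motion(audio: List[float], motion_dim: int,
                    steps: int = 4) -> List[List[float]]:
    """Stage 2 stand-in: one motion vector per audio frame.

    Each vector starts at zero and moves halfway toward an
    audio-conditioned target per step. A real diffusion model would
    instead denoise random noise with a trained network.
    """
    sequence = []
    for a in audio:
        x = [0.0] * motion_dim
        for _ in range(steps):
            x = [xi + 0.5 * (a - xi) for xi in x]
        sequence.append(x)
    return sequence


def animate(image: List[float],
            audio: List[float]) -> List[Tuple[List[float], List[float]]]:
    """Pair the fixed appearance with each generated motion vector."""
    rep = extract_representation(image)
    motions = generate_motion(audio, motion_dim=len(rep.motion))
    return [(rep.appearance, m) for m in motions]


frames = animate(image=[0.2, 0.4, 0.6], audio=[1.0, 0.0])
print(len(frames))  # one output frame per audio frame
```

In this sketch the motion generator takes only the audio and a dimension, never the image itself, which is one way to picture how a decoupled design lets the same audio-driven motion be applied to different subjects.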
