harlanhong/ACTalker
An ICCV 2025 end-to-end video diffusion framework for talking head synthesis supporting audio and expression control.

Velocity (7d): +1.0 ★/day · Trend: steady
ACTalker is a video diffusion model that generates realistic talking head videos from audio and expression signals. It uses masked selective state spaces modeling to achieve audio-visual synchronization and natural motion in avatar animation. The framework supports both single-signal and multi-signal control for digital human synthesis and targets high-quality generation of talking avatar videos.