antgroup/echomimic_v2
A deep learning model that generates semi-body human animations and talking-head videos from audio input.

Velocity (7d): +8.1 ★/day
Trend: steady
EchoMimicV2 is a neural network model that generates animated human videos from audio. The audio input drives facial expressions, head movements, and upper-body gestures, producing realistic talking-head and semi-body animation sequences. The model aims to simplify earlier approaches while maintaining quality for portrait and semi-body animation.