fudan-generative-vision/hallo3
A Video Diffusion Transformer that animates static portrait images into dynamic, realistic videos driven by reference inputs.

Velocity (7d): +2.5 ★/day · Trend: steady
Hallo3 is a portrait image animation system that uses a Video Diffusion Transformer architecture to generate highly dynamic, realistic video sequences from a single portrait image. A driving input (such as audio or another video) controls the motion and expressions of the portrait subject. The work targets talking-head generation and character animation and was published at CVPR 2025.