EvelynFan/FaceFormer

A Transformer-based neural network that synthesizes realistic 3D facial motions from speech audio.

915 stars · Python · Image · Video · Audio
Velocity (7d): +0.6 ★/day · Trend: steady

FaceFormer is an end-to-end Transformer architecture that autoregressively generates sequences of 3D facial meshes from audio input. Given a neutral face template and raw audio, it produces accurate lip movements and facial expressions. The implementation is in PyTorch and includes pretrained models for the VOCASET and BIWI datasets.
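
To make the autoregressive idea concrete, here is a minimal PyTorch sketch of frame-by-frame mesh generation conditioned on audio features and a neutral template. It is not FaceFormer's code: the class `ToyAutoregressiveFaceDecoder`, its `generate` method, every dimension, and the choice to predict per-frame offsets from the template are assumptions made for illustration. The audio encoder and positional encoding are left out to keep it short.

```python
import torch
import torch.nn as nn


class ToyAutoregressiveFaceDecoder(nn.Module):
    """Hypothetical, simplified stand-in; not the FaceFormer implementation."""

    def __init__(self, audio_dim=32, num_vertices=10, d_model=64):
        super().__init__()
        self.vertex_dim = num_vertices * 3  # flattened (x, y, z) per vertex
        self.audio_proj = nn.Linear(audio_dim, d_model)
        self.motion_proj = nn.Linear(self.vertex_dim, d_model)
        layer = nn.TransformerDecoderLayer(
            d_model, nhead=4, dim_feedforward=128, batch_first=True
        )
        self.decoder = nn.TransformerDecoder(layer, num_layers=1)
        self.out = nn.Linear(d_model, self.vertex_dim)

    @torch.no_grad()
    def generate(self, audio_feats, template):
        # audio_feats: (batch, frames, audio_dim), assumed precomputed
        # template:    (batch, num_vertices * 3), the neutral face
        memory = self.audio_proj(audio_feats)
        batch, num_frames, _ = audio_feats.shape
        # Start token: zero displacement from the template.
        motions = torch.zeros(batch, 1, self.vertex_dim, device=audio_feats.device)
        for _ in range(num_frames):
            tgt = self.motion_proj(motions)
            steps = tgt.size(1)
            # Causal mask: each position attends only to itself and the past.
            causal = torch.triu(
                torch.full((steps, steps), float("-inf"), device=tgt.device),
                diagonal=1,
            )
            hidden = self.decoder(tgt, memory, tgt_mask=causal)
            # Predict the next frame's offset from the last position only.
            next_motion = self.out(hidden[:, -1:, :])
            motions = torch.cat([motions, next_motion], dim=1)
        # Drop the start token and add the template to obtain mesh vertices.
        return motions[:, 1:, :] + template.unsqueeze(1)


torch.manual_seed(0)
model = ToyAutoregressiveFaceDecoder().eval()
audio_feats = torch.randn(2, 5, 32)  # 2 clips, 5 frames of made-up features
template = torch.zeros(2, 30)        # 10 vertices * 3 coordinates
meshes = model.generate(audio_feats, template)
print(meshes.shape)  # torch.Size([2, 5, 30])
```

Each loop iteration feeds all previously predicted frames back into the decoder, which is what makes the generation autoregressive. The weights here are random, so the output is meaningful only in shape.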