
Phantom-video/HuMo

HuMo is a human-centric video generation model that conditions on multiple modalities including reference images.

1.2k stars · Python · Image · Video · Audio
Velocity (7d): +4.6 ★ / day · Trend: steady

HuMo generates human-centric videos through collaborative multi-modal conditioning: it synthesizes video content from reference images combined with inputs from other modalities. This research project from Tsinghua University and ByteDance includes model weights on Hugging Face and a dataset of 670K video samples.
