OpenMOSS/MOVA
MOVA is a diffusion-based foundation model that generates synchronized video and audio content in a single inference pass.

Velocity (7d): +7.9 ★/day · Trend: steady
MOVA ends the “silent era” of open-source video generation by synthesizing video and audio jointly, so the two stay aligned. Unlike cascaded pipelines, which chain separate video and audio models, it generates high-fidelity video and synchronized audio in a single inference pass, avoiding the error accumulation that occurs between stages. The project reports state-of-the-art performance in multilingual lip synchronization and environment-aware sound effects, and the model is fully open source.