lucidrains/nuwa-pytorch
PyTorch implementation of NÜWA, a transformer-based attention network for text-to-video generation.

Velocity (7d): +0.3 ★/day · Trend: steady
Implements the NÜWA model for text-to-video synthesis, pairing a transformer-based attention architecture with a VQGanVAE that encodes frames into discrete codebook tokens. The project also extends to joint video and audio generation with a dual-decoder approach, using hierarchical causal transformers over quantized codebooks.
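As a rough illustration of the two ideas named above (quantized codebooks and causal attention), here is a minimal sketch in plain Python. It does not use the repository's API: `quantize`, `causal_mask`, and the toy codebook are hypothetical names and values chosen for this example, and a real implementation would operate on PyTorch tensors with a learned codebook.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    vectors: Sequence[Vector], codebook: Sequence[Vector]
) -> Tuple[List[int], List[List[float]]]:
    """Map each vector to the index of its nearest codebook entry.

    Returns the token indices and the corresponding codebook vectors.
    (Hypothetical helper for illustration only.)
    """
    indices = [
        min(range(len(codebook)), key=lambda k: squared_distance(v, codebook[k]))
        for v in vectors
    ]
    quantized = [list(codebook[i]) for i in indices]
    return indices, quantized


def causal_mask(n: int) -> List[List[bool]]:
    """Lower-triangular mask: position i may attend to positions j <= i."""
    return [[j <= i for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    # Toy codebook with three 2-dimensional entries (assumed values).
    codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
    latents = [[0.1, -0.2], [0.9, 1.2], [3.5, 4.1]]

    tokens, _ = quantize(latents, codebook)
    print(tokens)

    for row in causal_mask(len(tokens)):
        print(row)
```

The token indices are what an autoregressive transformer would model; the causal mask ensures each token is predicted only from the tokens that precede it.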