Is generative-models open source?

Yes — Stability-AI/generative-models is open source, released under the MIT license.

What language is generative-models written in?

Stability-AI/generative-models is primarily written in Python.

How popular is generative-models?

Stability-AI/generative-models has 27.2k stars on GitHub.

Where can I find generative-models?

Stability-AI/generative-models is on GitHub at https://github.com/Stability-AI/generative-models.

Stability-AI/generative-models

Stability AI's video-to-4D pipeline, now with fewer moving parts

A monorepo of diffusion models that turns flat images into orbiting 3D videos, and videos into 4D assets you can walk around.

★ 27.2k stars · Python · Image · Video · Audio · Inference · Serving

What it does

This is Stability AI’s research release hub for generative video and 3D models. The current star is SV4D 2.0, which takes a short input video of a moving object and generates novel-view videos from multiple camera angles, effectively reconstructing 4D space (3D plus time). The repo also houses SV3D for image-to-3D-orbit synthesis, Stable Video Diffusion for image-to-video, and SDXL-Turbo for fast text-to-image.

The interesting bit

SV4D 2.0 drops a dependency that hampered its predecessor: it no longer needs SV3D to generate reference multi-views of the first frame. That makes it more robust to self-occlusions and better at handling real-world videos with messy backgrounds. The trade-off is a hungrier GPU: 576×576 resolution, 50 default sampling steps, and autoregressive generation for longer clips.
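
For a rough sense of scale, the figures quoted on this page (576×576 output, a grid of 12 video frames × 4 views, and a 512×512 fallback) can be turned into a pixel count. This is a back-of-the-envelope sketch: pixel count is only a proxy for load, not a VRAM measurement, and the function name is made up for illustration.

```python
def pixels_per_grid(video_frames: int, views: int, side: int) -> int:
    """Total pixels in a grid of video_frames x views square images."""
    return video_frames * views * side * side


# Figures taken from this page; nothing here is measured on a GPU.
full = pixels_per_grid(video_frames=12, views=4, side=576)
low = pixels_per_grid(video_frames=12, views=4, side=512)
ratio = low / full

print(f"576x576 grid: {full:,} pixels")
print(f"512x512 grid: {low:,} pixels ({ratio:.0%} of the full grid)")
```

Dropping to 512×512 trims roughly a fifth of the pixels. How that maps to actual memory use depends on the model internals, which this count cannot tell you.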

Key highlights

SV4D 2.0 generates 48 frames (12 video frames × 4 views) at 576×576; an 8-view variant exists for different use cases
Input can be GIF, MP4, or frame sequences; background removal via rembg, Clipdrop, or SAM2 is recommended for clean results
Low-VRAM fallback: set --encoding_t=1 --decoding_t=1 or drop to 512×512 resolution
SV3D_p variant accepts custom camera paths via elevation/azimuth degree sequences
Includes Streamlit and Gradio demos for local inference
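
The elevation/azimuth idea behind custom camera paths can be sketched in plain Python. This is an illustration only: the function name, the 21-frame count, the 10° elevation, and the choice to end the orbit back at 0° are all assumptions, not the repo's documented interface.

```python
def orbit_camera_path(num_frames: int = 21, elevation_deg: float = 10.0):
    """Return (elevations, azimuths) in degrees for one full orbit.

    Azimuths are evenly spaced and wrap so the last frame lands on 0 degrees.
    Both the frame count and the wrap convention are assumptions.
    """
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    azimuths = [
        round(((i + 1) * 360.0 / num_frames) % 360.0, 3)
        for i in range(num_frames)
    ]
    elevations = [float(elevation_deg)] * num_frames
    return elevations, azimuths


elevations, azimuths = orbit_camera_path()
print("elevations:", elevations[:3], "...")
print("azimuths:  ", azimuths[:3], "...", azimuths[-1])
```

A constant elevation gives a flat turntable orbit; varying the elevation list frame by frame would describe a rising or dipping camera instead.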

Caveats

The MIT license applies to the code; the README tags all models “for research purposes” and mentions no commercial license for the weights
Setup requires Python 3.10, CUDA 11.8, and a dependency pulled from a separate datapipelines repo
The 21-frame default input length for SV4D/SV4D 2.0 is arbitrary: the scripts autoregressively extend from smaller native chunk sizes
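
One way to picture autoregressive extension is as overlapping windows over the input frames. The sketch below is a generic planner, not the repo's scripts: the 12-frame chunk size is borrowed from the SV4D 2.0 frame grid quoted above, while the one-frame overlap and the function name are assumptions.

```python
def plan_chunks(total_frames: int, chunk_size: int = 12, overlap: int = 1):
    """Split frame indices into windows that share `overlap` frames.

    Each window after the first starts on the tail of the previous one,
    so earlier output can condition the next pass (assumed scheme).
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    if total_frames <= 0:
        return []
    chunks = []
    start = 0
    while True:
        end = min(start + chunk_size, total_frames)
        chunks.append(list(range(start, end)))
        if end == total_frames:
            break
        start = end - overlap  # re-use the last `overlap` frames
    return chunks


for i, chunk in enumerate(plan_chunks(21)):
    print(f"pass {i}: frames {chunk[0]}..{chunk[-1]} ({len(chunk)} frames)")
```

With these assumed settings, a 21-frame input is covered in two passes, which is why the input length is a script default rather than a hard model limit.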

Verdict

Grab this if you’re doing novel-view synthesis, 4D reconstruction research, or need a reference implementation of diffusion-based video generation. Skip if you want production-ready APIs or lack VRAM headroom.
