Is DimensionX open source?

Yes — wenqsun/DimensionX is open source, released under the Apache-2.0 license.

What language is DimensionX written in?

wenqsun/DimensionX is primarily written in Python.

How popular is DimensionX?

wenqsun/DimensionX has 1.3k stars on GitHub.

Where can I find DimensionX?

wenqsun/DimensionX is on GitHub at https://github.com/wenqsun/DimensionX.

← all repositories

wenqsun/DimensionX

One image in, explorable 3D world out — via video diffusion

DimensionX turns a single photo into controllable camera videos, then extracts actual 3D/4D scenes by treating space and time as separate knobs.

★1.3k stars Python Image · Video · Audio Creative · Design

View on GitHub ↗

Not currently ranked — collecting fresh signals.

star history

What it does DimensionX starts with one image and generates videos with precise camera control — orbit left, pan up, dolly in, etc. — using fine-tuned LoRA adapters on top of CogVideoX. From those generated frame sequences, it reconstructs explorable 3D scenes (via Gaussian Splatting) and 4D scenes (time-varying 3D) by treating spatial motion and temporal dynamics as independently controllable factors.

The interesting bit The authors’ “ST-Director” decouples space and time in video diffusion instead of letting the model freely mix them. This is the boring-sounding trick that makes the whole pipeline work: by forcing the model to learn separate “directors” for camera motion and scene dynamics, they can recombine them to recover geometry that actually holds up under novel viewpoints.

Key highlights

12 camera LoRAs covering 6 degrees of freedom (± each axis), plus 4 orbital presets
360° orbit model generates 145 frames (~18s) in ~6 min on a single A6000 (30.5 GB VRAM)
Standard LoRA inference: ~3 min on A100/A800 for 48-frame, 6-second clips (26.3 GB VRAM)
Full training pipeline, datasets, and 3DGS optimization code open-sourced as of Oct 2025
Built on CogVideoX-5B-I2V; authors note HunyuanVideo and WanX as alternative bases

Caveats

4D generation’s “identity-preserving denoising code” is still on the todo list; not yet released
README setup involves multiple conda envs, manual checkpoint shuffling, and SAT/THUDM-specific weight formats
Inference “better result” note suggests running a VLM captioning pass first — not quite one-click

Verdict Worth a look if you’re doing neural rendering, single-view reconstruction, or controllable video generation research. Skip if you need polished 4D output today or lack the GPU headroom for 30+ GB models.

Frequently asked

What is wenqsun/DimensionX?: DimensionX turns a single photo into controllable camera videos, then extracts actual 3D/4D scenes by treating space and time as separate knobs.
Is DimensionX open source?: Yes — wenqsun/DimensionX is open source, released under the Apache-2.0 license.
What language is DimensionX written in?: wenqsun/DimensionX is primarily written in Python.
How popular is DimensionX?: wenqsun/DimensionX has 1.3k stars on GitHub.
Where can I find DimensionX?: wenqsun/DimensionX is on GitHub at https://github.com/wenqsun/DimensionX.