NVIDIA/Dataset_Synthesizer

Synthetic training data, generated inside a game engine

NVIDIA's UE4 plugin renders labeled images so you don't have to label them by hand.

602 stars · C++ · Data Tooling · 7-day velocity: +0.2 ★/day · trend: steady

What it does

NDDS is an Unreal Engine 4 plugin that generates synthetic training data for computer vision models. It exports images alongside ground-truth metadata—segmentation masks, depth maps, 3D poses, bounding boxes, keypoints, and custom stencils—while randomizing lighting, textures, camera angles, and object placement to improve domain transfer.
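To make the relationship between these annotation types concrete, here is a minimal sketch of how 3D cuboid vertices relate to 2D keypoints and a 2D bounding box under a simple pinhole camera model. It is not NDDS code and does not reflect the plugin's export format; the function names, the axis-aligned cuboid, and the intrinsics (`fx`, `fy`, `u0`, `v0`) are assumptions chosen for illustration.

```python
from itertools import product


def cuboid_vertices(center, size):
    """Return the 8 corners of an axis-aligned cuboid in camera coordinates."""
    cx, cy, cz = center
    sx, sy, sz = size
    return [
        (cx + dx * sx / 2, cy + dy * sy / 2, cz + dz * sz / 2)
        for dx, dy, dz in product((-1, 1), repeat=3)
    ]


def project(point, fx, fy, u0, v0):
    """Pinhole projection of a camera-frame point (z forward) to pixels."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point is behind the camera")
    return (fx * x / z + u0, fy * y / z + v0)


def bbox_2d(pixels):
    """Tightest axis-aligned box (u_min, v_min, u_max, v_max) around pixels."""
    us = [u for u, _ in pixels]
    vs = [v for _, v in pixels]
    return (min(us), min(vs), max(us), max(vs))


# Hypothetical scene: a 1 m cube centered 2 m in front of a 640x480 camera.
vertices = cuboid_vertices(center=(0.0, 0.0, 2.0), size=(1.0, 1.0, 1.0))
keypoints = [project(p, fx=500.0, fy=500.0, u0=320.0, v0=240.0) for p in vertices]
box = bbox_2d(keypoints)
```

In a renderer all of these values are known exactly for every frame, which is why a single capture can emit the 3D vertices, their projected keypoints, and the enclosing 2D box together.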

The interesting bit

The randomization isn’t an afterthought; it’s the core research insight. NVIDIA showed that aggressively varying synthetic scenes lets models trained purely in simulation work on real-world data, sidestepping the cost and pain of hand-labeling annotations that need expert effort, such as 3D bounding box vertices.
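As a rough illustration of the idea, the sketch below samples a fresh set of scene parameters for every frame. It is not the NDDS API: the `SceneParams` fields, the value ranges, and the `sample_scene` helper are all hypothetical, picked only to mirror the kinds of things the plugin randomizes (lighting, textures, camera angles, distractors).

```python
import random
from dataclasses import dataclass


@dataclass
class SceneParams:
    """Hypothetical per-frame parameters; not an NDDS data structure."""
    light_intensity: float
    light_color: tuple
    texture_id: int
    camera_yaw_deg: float
    camera_pitch_deg: float
    camera_distance_m: float
    num_distractors: int


def sample_scene(rng, num_textures=100):
    """Draw one randomized scene configuration from assumed ranges."""
    return SceneParams(
        light_intensity=rng.uniform(0.2, 5.0),
        light_color=(rng.random(), rng.random(), rng.random()),
        texture_id=rng.randrange(num_textures),
        camera_yaw_deg=rng.uniform(0.0, 360.0),
        camera_pitch_deg=rng.uniform(-60.0, 60.0),
        camera_distance_m=rng.uniform(0.5, 3.0),
        num_distractors=rng.randint(0, 10),
    )


# A seeded generator keeps the dataset reproducible across runs.
rng = random.Random(0)
scenes = [sample_scene(rng) for _ in range(1000)]
```

Every frame gets an independent draw, so the model never sees the same lighting, texture, and viewpoint combination often enough to overfit to it. In this framing, real-world images become just one more variation.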

Key highlights

  • Built as a native UE4 plugin, not a standalone renderer—leverages the full engine pipeline
  • Exports multiple annotation types simultaneously from a single scene capture
  • Includes camera path following and distractor object placement for richer scenes
  • Companion repo (NVDU) provides visualization utilities for the exported data
  • Backed by published ICRA/CVPR results on synthetic-to-real transfer

Caveats

  • UE 4.22 has a known memory leak: randomizing materials on 10 or more objects causes uniform buffer memory to balloon, and stopping play-in-editor may hang. The workaround is to restart the editor, or to downgrade to UE 4.21 with NDDS v1.1
  • Vulkan backend avoids the memory issue but cannot capture depth or class segmentation
  • Requires git-lfs; the README warns explicitly against downloading as ZIP

Verdict

Worth a look if you’re doing pose estimation or object detection and lack labeled real-world data. Skip it if you’re already sitting on a large annotated dataset or if your team has no UE4 expertise—the plugin is powerful but not plug-and-play.
