Is HugeCTR open source?

Yes — NVIDIA-Merlin/HugeCTR is open source, released under the Apache-2.0 license.

What language is HugeCTR written in?

NVIDIA-Merlin/HugeCTR is primarily written in C++.

How popular is HugeCTR?

NVIDIA-Merlin/HugeCTR has 1.1k stars on GitHub.

Where can I find HugeCTR?

NVIDIA-Merlin/HugeCTR is on GitHub at https://github.com/NVIDIA-Merlin/HugeCTR.

NVIDIA-Merlin/HugeCTR

NVIDIA's GPU recommender engine for ads that click

A C++ framework that trains massive click-through-rate models on GPUs without pretending the embedding layer is someone else's problem.

★1.1k stars C++ Domain Apps ML Frameworks

What it does

HugeCTR trains and runs inference on large deep-learning recommender models: think ads, feeds, anything with a sparse embedding table that eats your RAM for breakfast. It exposes a Python API, but the heavy lifting is C++ and CUDA under the hood. You define a model graph in Python, feed it Parquet data, and HugeCTR handles the GPU orchestration, multi-node NCCL chatter, and mixed-precision math.
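To put a rough number on the "eats your RAM" problem, here is a back-of-the-envelope sketch of how much memory the raw weights of one embedding table need. The helper name and the workload (one billion IDs, 128-dimensional vectors) are made up for illustration and do not come from the HugeCTR README; optimizer state and hash-table overhead are ignored.

```python
def embedding_table_bytes(num_rows: int, dim: int, bytes_per_value: int = 4) -> int:
    """Bytes needed for the raw weights of one embedding table.

    Hypothetical helper for illustration; not part of any HugeCTR API.
    """
    return num_rows * dim * bytes_per_value


# Assumed workload: 1 billion distinct IDs, 128 floats per ID.
fp32 = embedding_table_bytes(1_000_000_000, 128, bytes_per_value=4)
fp16 = embedding_table_bytes(1_000_000_000, 128, bytes_per_value=2)

print(f"fp32: {fp32 / 1e9:.0f} GB, fp16: {fp16 / 1e9:.0f} GB")
# prints "fp32: 512 GB, fp16: 256 GB"
```

Even at half precision, a table of that size would not fit on a single GPU, which is why sharding it across devices matters.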

The interesting bit

The framework treats “very large embedding” as a first-class citizen, not an afterthought. It ships model-parallel training and a separate Sparse Operation Kit so you can extract just the embedding guts if you don’t want the full framework. That’s the part most generic DL frameworks make you duct-tape together yourself.
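The idea behind model-parallel embeddings can be sketched in plain Python. This is a toy illustration, not HugeCTR's implementation: the class name, the `key % num_shards` placement rule, and the sum pooling are all assumptions chosen for simplicity, and each "shard" is just a dict standing in for one GPU's slice of the table.

```python
import random


class ShardedEmbedding:
    """Toy model-parallel embedding table: row `key` lives on shard `key % num_shards`."""

    def __init__(self, num_rows, dim, num_shards, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        self.num_shards = num_shards
        # Each dict stands in for the slice of the table held by one device.
        self.shards = [dict() for _ in range(num_shards)]
        for key in range(num_rows):
            row = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
            self.shards[self.owner(key)][key] = row

    def owner(self, key):
        """Return the shard index that stores `key` (assumed modulo placement)."""
        return key % self.num_shards

    def lookup_sum(self, keys):
        """Sum-pool the vectors for `keys`, building one partial sum per shard."""
        partials = [[0.0] * self.dim for _ in range(self.num_shards)]
        for key in keys:
            shard_id = self.owner(key)
            row = self.shards[shard_id][key]
            for i, value in enumerate(row):
                partials[shard_id][i] += value
        # On real hardware this reduction would be a collective across GPUs.
        return [sum(p[i] for p in partials) for i in range(self.dim)]


table = ShardedEmbedding(num_rows=1000, dim=4, num_shards=4)
pooled = table.lookup_sum([3, 42, 999])
print([len(shard) for shard in table.shards], len(pooled))
```

No shard ever holds the full table; each one contributes only a partial sum, and the final reduction is the step that would turn into cross-GPU communication in a real system.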

Key highlights

Python frontend over a C++/CUDA backend; the README claims MLPerf benchmark participation but gives no numbers
Model-parallel training, multi-node via NCCL, mixed precision
ONNX export for trained models
Sparse Operation Kit: standalone GPU-accelerated sparse ops for external use
Docker-based workflow; users build images from provided Dockerfiles since v25.03

Caveats

As of version 25.03, NVIDIA ships only Dockerfiles: you build the image yourself, and there is no prebuilt container
README notes that evaluation AUC will be “incorrect” with the synthetic demo data, which is honest but also means the quickstart doesn’t validate model quality
The README's “Fast” claim cites benchmarks but gives no actual throughput or latency figures

Verdict

Worth a look if you’re running recommender training at scale and already live in NVIDIA’s ecosystem. Skip it if you need CPU fallback, non-NVIDIA GPUs, or a quick pip-install experience.
