cyrusbehr/tensorrt-cpp-api

A no-throw C++ library that keeps TensorRT out of your headers

It wraps NVIDIA's TensorRT C++ inference stack in a cache-safe, exception-free API with explicit CUDA streams and optional zero-copy Python bindings.

★ 808 stars · C++ · Inference · Serving · Computer Vision

What it does

tensorrt_cpp_api ingests an ONNX model, builds or loads a cached TensorRT engine, and runs inference through a compact C++20 API. You feed it name-keyed tensors on caller-owned CUDA streams, pay for host/device copies explicitly, and receive errors as Status values rather than exceptions. The public headers expose no nvinfer1, OpenCV, or spdlog types, only a PImpl boundary and a generated build_config.h, so code that includes them compiles without TensorRT's headers; TensorRT is still needed at runtime.
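
To make the PImpl boundary and the no-throw error model concrete, here is a minimal sketch of how a header can hide its backend behind an opaque pointer and report failures as values. The names Engine, Status, and loadOnnx are hypothetical and chosen for illustration; they are not taken from the library's headers, and the backend here is a stand-in that never touches TensorRT.

```cpp
// Sketch only: Engine, Status and loadOnnx are hypothetical names,
// not the library's actual API.
#include <memory>
#include <string>

// ---- what a public header could contain: standard types only ----
class Status {
public:
    static Status ok() noexcept { return Status(nullptr); }
    static Status error(const char* msg) noexcept { return Status(msg); }
    bool isOk() const noexcept { return msg_ == nullptr; }
    const char* message() const noexcept { return msg_ ? msg_ : ""; }

private:
    explicit Status(const char* msg) noexcept : msg_(msg) {}
    const char* msg_;  // points at a string literal; null means success
};

class Engine {
public:
    Engine();   // may throw std::bad_alloc; a stricter design would use a factory
    ~Engine();  // defined where Impl is a complete type

    Status loadOnnx(const std::string& path) noexcept;
    bool loaded() const noexcept;

private:
    struct Impl;                  // opaque: backend types never leak into the header
    std::unique_ptr<Impl> impl_;
};

// ---- what the matching .cpp file could contain ----
struct Engine::Impl {
    // A real backend would hold its TensorRT objects here.
    std::string modelPath;
    bool loaded = false;
};

Engine::Engine() : impl_(std::make_unique<Impl>()) {}
Engine::~Engine() = default;

Status Engine::loadOnnx(const std::string& path) noexcept {
    try {
        if (path.empty()) return Status::error("empty model path");
        impl_->modelPath = path;  // copying the string may throw; caught below
        impl_->loaded = true;
        return Status::ok();
    } catch (...) {
        // Convert any exception into a value so nothing crosses the API boundary.
        return Status::error("unexpected failure while loading model");
    }
}

bool Engine::loaded() const noexcept { return impl_->loaded; }
```

A caller would write something like `if (Status s = engine.loadOnnx(path); !s.isOk()) { /* inspect s.message() */ }`, with no try/catch and no backend includes on the caller's side.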

The interesting bit

The engine cache is keyed by ONNX content hash, build options, TensorRT version, and GPU UUID, with a JSON sidecar and atomic writes; a stale cache is rebuilt instead of silently reused. On the Python side, trtcpp bindings speak __cuda_array_interface__ and DLPack, release the GIL during inference, and benchmark within roughly 13% of the C++ path with no host round-trips.
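
The cache-keying idea can be sketched as follows. This is an illustration under stated assumptions, not the library's code: CacheInputs, cacheKey, and writeAtomically are hypothetical names, and FNV-1a stands in for whatever content hash the project actually uses. The atomic write uses the common write-to-temp-then-rename pattern.

```cpp
// Sketch only: all names here are hypothetical, and FNV-1a is a stand-in hash.
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <initializer_list>
#include <string>
#include <string_view>
#include <system_error>

// 64-bit FNV-1a; the seed parameter lets several fields be chained into one hash.
inline std::uint64_t fnv1a64(std::string_view data,
                             std::uint64_t seed = 0xcbf29ce484222325ULL) noexcept {
    std::uint64_t h = seed;
    for (unsigned char c : data) {
        h ^= c;
        h *= 0x100000001b3ULL;
    }
    return h;
}

struct CacheInputs {
    std::string onnxBytes;     // raw contents of the model file
    std::string buildOptions;  // serialized build options
    std::string trtVersion;    // TensorRT version string
    std::string gpuUuid;       // UUID of the target GPU
};

// Any change to the model, options, TensorRT version or GPU yields a new key,
// so an engine built under different conditions is never matched.
inline std::uint64_t cacheKey(const CacheInputs& in) noexcept {
    std::uint64_t h = fnv1a64(in.onnxBytes);
    for (const std::string* field : {&in.buildOptions, &in.trtVersion, &in.gpuUuid}) {
        h = fnv1a64("\x1f", h);  // separator so ("ab","c") and ("a","bc") differ
        h = fnv1a64(*field, h);
    }
    return h;
}

// Write to a sibling temp file, then rename over the destination. On POSIX
// filesystems the rename replaces the target atomically, so readers see either
// the old file or the complete new one.
inline bool writeAtomically(const std::filesystem::path& dest,
                            std::string_view bytes) noexcept {
    try {
        std::filesystem::path tmp = dest;
        tmp += ".tmp";
        {
            std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
            out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
            out.flush();
            if (!out) return false;
        }
        std::error_code ec;
        std::filesystem::rename(tmp, dest, ec);
        return !ec;
    } catch (...) {
        return false;
    }
}
```

On load, a sidecar file would store the key that was current at build time; if recomputing the key from the present inputs gives a different value, the cached engine is stale and should be rebuilt.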

Key highlights

No-throw Status/Result<T> error model; explicit host/device transfers on caller-provided Stream handles.
EnginePool leases execution contexts for multi-stream, dynamic-shape inference with per-input min/opt/max profiles.
Optional fused CUDA preprocessing kernel: letterbox-resize, color swap, normalize, HWC→NCHW, and cast in one shot.
Precision choices (FP16, Int8QDQ, FP8) fail loudly when unavailable instead of silently falling back.
cmake --install produces a find_package(tensorrt_cpp_api) target; optional trtcpp Python wheel via scikit-build-core.
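
As a reference for what the fused preprocessing step computes, here is a CPU-only sketch that performs letterbox-resize, BGR to RGB swap, normalization to [0, 1], HWC to NCHW layout change, and the cast to float in a single pass. It is not the library's CUDA kernel. The names Letterbox, letterbox, and preprocess are hypothetical, nearest-neighbor sampling is an assumption made for brevity, and the pad value is left to the caller.

```cpp
// Sketch only: a CPU reference, not the library's kernel. All names are
// hypothetical; positive image dimensions are assumed.
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Letterbox {
    int newW, newH;       // size of the resized image inside the canvas
    int padLeft, padTop;  // offset of the resized image inside the canvas
    float scale;          // uniform scale factor applied to the source
};

// Fit src into dst while preserving aspect ratio, centering the result.
inline Letterbox letterbox(int srcW, int srcH, int dstW, int dstH) noexcept {
    const float scale = std::min(static_cast<float>(dstW) / srcW,
                                 static_cast<float>(dstH) / srcH);
    const int newW = static_cast<int>(std::lround(srcW * scale));
    const int newH = static_cast<int>(std::lround(srcH * scale));
    return {newW, newH, (dstW - newW) / 2, (dstH - newH) / 2, scale};
}

// srcBgr: interleaved 8-bit BGR pixels (HWC). Returns planar float RGB (CHW)
// of size 3 * dstW * dstH, with values in [0, 1] and padValue in the borders.
inline std::vector<float> preprocess(const std::vector<std::uint8_t>& srcBgr,
                                     int srcW, int srcH, int dstW, int dstH,
                                     float padValue) {
    const Letterbox lb = letterbox(srcW, srcH, dstW, dstH);
    const std::size_t plane = static_cast<std::size_t>(dstW) * dstH;
    std::vector<float> out(3 * plane, padValue);

    for (int y = 0; y < lb.newH; ++y) {
        // Nearest-neighbor lookup back into the source image.
        const int sy = std::min(srcH - 1, static_cast<int>(y / lb.scale));
        for (int x = 0; x < lb.newW; ++x) {
            const int sx = std::min(srcW - 1, static_cast<int>(x / lb.scale));
            const std::uint8_t* px =
                &srcBgr[(static_cast<std::size_t>(sy) * srcW + sx) * 3];
            const std::size_t dst =
                static_cast<std::size_t>(y + lb.padTop) * dstW + (x + lb.padLeft);
            for (int c = 0; c < 3; ++c) {
                // px[2 - c] reverses the channel order (BGR to RGB);
                // each channel goes to its own plane (HWC to CHW).
                out[c * plane + dst] = px[2 - c] / 255.0f;
            }
        }
    }
    return out;
}
```

For a 1280x720 frame and a 640x640 network input, letterbox gives a scale of 0.5, a 640x360 resized image, and 140 rows of padding above it. Doing all five steps per output pixel is what lets a GPU version avoid intermediate buffers.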

Caveats

Linux-only, CUDA 12, TensorRT ≥ 10; CNN-style vision models are the target, and Windows or LLM/transformer features are explicitly out of scope.
The published latency figures come from an RTX 3080 Laptop GPU; expect different numbers on other hardware.

Verdict

Grab it if you want TensorRT’s throughput without the usual C++ API friction. Look elsewhere if you need Windows support or transformer inference.

Frequently asked

What is cyrusbehr/tensorrt-cpp-api? A C++ library that wraps NVIDIA's TensorRT inference stack in a cache-safe, exception-free API with explicit CUDA streams and optional zero-copy Python bindings.
Is tensorrt-cpp-api open source? Yes. cyrusbehr/tensorrt-cpp-api is open source, released under the MIT license.
What language is tensorrt-cpp-api written in? cyrusbehr/tensorrt-cpp-api is primarily written in C++.
How popular is tensorrt-cpp-api? cyrusbehr/tensorrt-cpp-api has 808 stars on GitHub.
Where can I find tensorrt-cpp-api? cyrusbehr/tensorrt-cpp-api is on GitHub at https://github.com/cyrusbehr/tensorrt-cpp-api.