Skipping ONNX, going straight to TensorRT
A converter that takes MMDetection models directly to TensorRT without the usual ONNX detour.

What it does
Takes models from the MMDetection ecosystem and compiles them into TensorRT engines. Supports FP16, INT8 (experimental), batched inputs, and dynamic shapes. Also wraps the converted engine so you can run inference through the standard MMDetection API.
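To make the dynamic-shape and wrapper claims concrete, here is a minimal sketch of what a conversion call might look like. The import paths (`mmdet2trt`, `mmdet2trt.apis.create_wrap_detector`) and keyword names (`opt_shape_param`, `fp16_mode`, `max_workspace_size`) are assumptions based on the project's README as recalled, so check them against the version you install. The helpers `make_shape_param` and `convert_and_wrap` are hypothetical names invented for this example, and the shape values are illustrative only.

```python
from typing import List, Tuple

Shape = List[int]


def make_shape_param(
    min_hw: Tuple[int, int],
    opt_hw: Tuple[int, int],
    max_hw: Tuple[int, int],
    batch: int = 1,
) -> List[List[Shape]]:
    """Build a [min, opt, max] NCHW shape range for one 3-channel image input."""
    shapes = [[batch, 3, h, w] for (h, w) in (min_hw, opt_hw, max_hw)]
    lo, opt, hi = shapes
    # A dynamic-shape profile needs min <= opt <= max in every dimension.
    if not all(a <= b <= c for a, b, c in zip(lo, opt, hi)):
        raise ValueError("expected min <= opt <= max for every dimension")
    # One entry per model input; detectors here take a single image tensor.
    return [shapes]


def convert_and_wrap(cfg_path: str, weight_path: str, device_id: int = 0):
    """Convert a checkpoint and wrap it for MMDetection-style inference.

    ASSUMPTION: the imports and keyword arguments below follow the project's
    README as recalled; they are not verified against a specific release.
    """
    # Imported lazily so the shape helper above works without the GPU stack.
    from mmdet2trt import mmdet2trt
    from mmdet2trt.apis import create_wrap_detector

    trt_model = mmdet2trt(
        cfg_path,
        weight_path,
        opt_shape_param=make_shape_param((320, 320), (800, 1344), (1344, 1344)),
        fp16_mode=True,              # FP16 engine
        max_workspace_size=1 << 30,  # 1 GiB of builder workspace
    )
    return create_wrap_detector(trt_model, cfg_path, device_id)
```

If those names hold, the object returned by `convert_and_wrap` should be usable with `inference_detector` from `mmdet.apis` in place of a regular MMDetection model, which is the "standard API" convenience described above.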
The interesting bit
Most converters go PyTorch → ONNX → TensorRT. This one goes PyTorch → TensorRT directly, which the author claims avoids “unnecessary ONNX IR.” That’s a real fork in the road: you either save yourself the ONNX export headaches, or you trade them for the edge cases of a custom converter and its plugin library.
Key highlights
- Supports a long list of detectors: Faster R-CNN, YOLOX, DETR, FCOS, RetinaNet, SSD, and many more
- Mask R-CNN and Cascade Mask R-CNN marked as experimental
- Includes C++ inference demo and DeepStream support
- Docker image provided for containerized conversion
- Requires companion repos: torch2trt_dynamic and amirstan_plugin (needs manual CMake build)
Caveats
- Some models have only been tested on MMDetection < 3.0; 3.x support is relatively recent (Feb 2024)
- Installation is multi-repo and involves setting environment variables manually — not a one-liner
- INT8 and mask support both flagged as experimental
Verdict
Worth a look if you’re already committed to MMDetection and want to shave the ONNX step off your deployment pipeline. If you’re not already in that ecosystem, the setup tax is steep.