YOLO on Jetson: the full pipeline from quantized training to DeepStream
NVIDIA's reference repo for squeezing YOLOv7 onto edge hardware without rewriting your entire stack.

What it does
Four connected samples that take YOLOv4/v7 from PyTorch training through INT8 quantization to TensorRT inference and finally DeepStream deployment. The yolov7_qat folder handles Quantization-Aware Training with NVIDIA’s pytorch-quantization toolkit; tensorrt_yolov7 and tensorrt_yolov4 provide standalone C++ apps for engine benchmarking; deepstream_yolo wires the result into NVIDIA’s streaming analytics SDK with custom output-layer parsing.
The interesting bit
The QAT workflow is the real meat. The repo includes explicit rules for Q/DQ (quantize/dequantize) node placement (rules.py) and a guidance doc on performance optimization, acknowledging that where you place quantization nodes matters more than whether you quantize at all. On Jetson AGX Orin, their INT8 QAT/PTQ engines hit 264 FPS at batch-16 versus 162 for FP16 (roughly 1.6x), with the README noting only a small mAP drop.
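To make the Q/DQ idea concrete, here is a minimal sketch of what a quantize/dequantize pair models numerically: symmetric per-tensor INT8 rounding with a scale taken from the tensor's maximum absolute value. It is an illustration in plain Python, not code from the repo or from the pytorch-quantization toolkit; the function name `fake_quantize` and the max-based calibration are assumptions made for this example.

```python
def fake_quantize(values, num_bits=8):
    """Symmetric per-tensor quantize-dequantize (hypothetical helper).

    Returns the values as they would look after passing through a Q/DQ
    pair, plus the scale that was used.
    """
    qmax = 2 ** (num_bits - 1) - 1            # 127 for INT8
    amax = max(abs(v) for v in values)        # assumed: simple max calibration
    if amax == 0:
        return list(values), 1.0              # all zeros: nothing to scale
    scale = amax / qmax

    out = []
    for v in values:
        # Quantize: scale, round to nearest integer, clamp to [-qmax, qmax].
        q = max(-qmax, min(qmax, round(v / scale)))
        # Dequantize: map the integer back to the float domain.
        out.append(q * scale)
    return out, scale


if __name__ == "__main__":
    dq, scale = fake_quantize([0.4, 1.6, -127.0, 60.2])
    print(dq, scale)
```

Each Q/DQ pair injects this rounding error during training so the network learns to tolerate it. Placement matters because every pair also marks a point where TensorRT must decide whether neighbouring layers can run fused in INT8.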
Key highlights
- End-to-end: PyTorch QAT → ONNX export → TensorRT engine → DeepStream pipeline
- Standalone C++ TensorRT apps for YOLOv4 and YOLOv7 with image, video, and COCO validation modes
- DeepStream integration includes custom nvdsparsebbox_Yolo.cpp for parsing YOLO’s detection output format
- Performance table covers Jetson Orin-X and Tesla T4 with FP16 and INT8, single and multi-stream
- cuda-post-process vs cpu-post-process comparison shows where the bottleneck actually lives
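For readers who have not written an output-layer parser before, the sketch below shows the general shape of the job: turn raw detection rows into thresholded boxes. It is a Python illustration, not a port of the repo's C++ parser. The row layout `[cx, cy, w, h, objectness, class scores...]`, the function name `parse_yolo_rows`, and the 0.25 threshold are all assumptions for this example, and non-maximum suppression is left out.

```python
def parse_yolo_rows(rows, conf_threshold=0.25):
    """Convert raw rows into detections (hypothetical helper).

    Assumed row layout: [cx, cy, w, h, objectness, class_0, class_1, ...]
    with box coordinates in pixels and scores already in [0, 1].
    """
    detections = []
    for row in rows:
        cx, cy, w, h, objectness = row[:5]
        class_scores = row[5:]

        # Pick the best class; ties resolve to the lowest class index.
        class_id = max(range(len(class_scores)), key=class_scores.__getitem__)
        score = objectness * class_scores[class_id]
        if score < conf_threshold:
            continue  # drop low-confidence candidates early

        # Convert centre/size to the top-left/size form used for drawing.
        detections.append({
            "left": cx - w / 2,
            "top": cy - h / 2,
            "width": w,
            "height": h,
            "class_id": class_id,
            "score": score,
        })
    return detections


if __name__ == "__main__":
    raw = [
        [100.0, 100.0, 40.0, 20.0, 0.9, 0.1, 0.8],  # kept
        [50.0, 50.0, 10.0, 10.0, 0.2, 0.5, 0.5],    # below threshold
    ]
    for det in parse_yolo_rows(raw):
        print(det)
```

A loop like this runs over every candidate row, which for a YOLO-sized output can be tens of thousands of rows per frame. That is one plausible reason the choice between CPU and CUDA post-processing shows up in the benchmark numbers.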
Caveats
- README grammar and formatting are rough; some sentences are unclear (“same performance of PTQ in TensorRT” is ambiguous)
- No explicit license mentioned in the provided README text
- DeepStream doesn’t support cudaGraph, so the trtexec numbers aren’t directly comparable to the streaming path
Verdict
Grab this if you’re building a production YOLO pipeline on Jetson and need a working quantization reference: not a tutorial, but a working config to adapt. Skip if you just want a quick Python demo; this is C++ Makefile territory with NVIDIA SDK dependencies.