zjhellofss/KuiperInfer
A C++ deep learning inference framework that implements operators such as convolution, pooling, and activation functions for models including Llama2, Yolo, and ResNet.

KuiperInfer is an open-source course project that guides users through building a high-performance deep learning inference engine from scratch in C++. It implements core operators including convolution, max pooling, ReLU, and sigmoid, and supports popular model architectures such as Llama2, YoloV5, ResNet, and UNet. The project has expanded to include a new course covering LLM inference with CUDA acceleration and Int8 quantization for Llama and Qwen series models.
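
To give a feel for the operators named above, here is a minimal sketch of ReLU, sigmoid, and max pooling over a flat `std::vector<float>`. The function names (`relu`, `sigmoid`, `max_pool2d`), the single-channel row-major layout, and the no-padding assumption are illustrative choices made for this example; they are not taken from KuiperInfer's actual API, which may organize tensors and layers quite differently.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// NOTE: hypothetical helper names for illustration only; not KuiperInfer's API.

// ReLU: element-wise max(0, x).
std::vector<float> relu(const std::vector<float>& input) {
  std::vector<float> output(input.size());
  std::transform(input.begin(), input.end(), output.begin(),
                 [](float x) { return std::max(0.0f, x); });
  return output;
}

// Sigmoid: element-wise 1 / (1 + exp(-x)).
std::vector<float> sigmoid(const std::vector<float>& input) {
  std::vector<float> output(input.size());
  std::transform(input.begin(), input.end(), output.begin(),
                 [](float x) { return 1.0f / (1.0f + std::exp(-x)); });
  return output;
}

// Max pooling over one channel stored row-major as rows x cols.
// Assumes a square kernel and no padding.
std::vector<float> max_pool2d(const std::vector<float>& input,
                              std::size_t rows, std::size_t cols,
                              std::size_t kernel, std::size_t stride) {
  if (kernel == 0 || stride == 0 || kernel > rows || kernel > cols ||
      input.size() != rows * cols) {
    throw std::invalid_argument("max_pool2d: invalid shape or window");
  }
  const std::size_t out_rows = (rows - kernel) / stride + 1;
  const std::size_t out_cols = (cols - kernel) / stride + 1;
  std::vector<float> output(out_rows * out_cols);
  for (std::size_t r = 0; r < out_rows; ++r) {
    for (std::size_t c = 0; c < out_cols; ++c) {
      // Start from the top-left element of the window, then scan the rest.
      float best = input[(r * stride) * cols + (c * stride)];
      for (std::size_t kr = 0; kr < kernel; ++kr) {
        for (std::size_t kc = 0; kc < kernel; ++kc) {
          best = std::max(best,
                          input[(r * stride + kr) * cols + (c * stride + kc)]);
        }
      }
      output[r * out_cols + c] = best;
    }
  }
  return output;
}
```

For instance, calling `max_pool2d` on a 4x4 feature map holding the values 1 through 16 with a 2x2 window and stride 2 keeps the largest value of each quadrant. A production engine would typically add multi-channel tensors, padding, and vectorized or GPU kernels on top of this basic logic.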