
godweiyang/NN-CUDA-Example

Examples showing how to write custom CUDA kernels and call them from PyTorch and TensorFlow for neural network operations.

1.5k stars · Python · ML Frameworks · Learning
Velocity · 7d: +0.8 ★ / day · Trend: steady

This repository provides examples of writing custom CUDA operators and integrating them with PyTorch and TensorFlow. It demonstrates three ways to build the CUDA kernels and their C++ wrappers: JIT compilation, setuptools, and CMake. It also includes Python scripts that compare the running time of the custom kernels against the frameworks' built-in operations, as well as model training examples that use the custom kernels.
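As a rough illustration of the JIT build path and the timing comparison described above, here is a minimal sketch. It is not taken from the repository: the extension name `add2`, the helper names `time_fn` and `load_add2_jit`, and any source file names passed in are assumptions made for illustration. The only PyTorch call used is `torch.utils.cpp_extension.load`, which compiles C++/CUDA sources at import time and needs PyTorch, `nvcc`, and a CUDA-capable GPU.

```python
import time
from statistics import mean


def time_fn(fn, sync=None, warmup=3, repeats=10):
    """Return the mean wall-clock seconds per call of fn().

    `sync` is an optional zero-argument callable that blocks until
    pending device work has finished. CUDA kernel launches are
    asynchronous, so without it the clock may stop before the kernel
    has actually run.
    """
    sync = sync or (lambda: None)
    for _ in range(warmup):  # warm-up calls are not measured
        fn()
    sync()
    samples = []
    for _ in range(repeats):
        sync()  # make sure earlier work does not leak into this sample
        start = time.perf_counter()
        fn()
        sync()  # wait for the work launched by fn() to complete
        samples.append(time.perf_counter() - start)
    return mean(samples)


def load_add2_jit(sources):
    """JIT-compile a hypothetical 'add2' extension from C++/CUDA sources.

    torch is imported lazily so that time_fn stays usable without it.
    """
    from torch.utils.cpp_extension import load

    return load(name="add2", sources=list(sources), verbose=True)
```

With PyTorch on a GPU machine, a built-in baseline could be timed as `time_fn(lambda: torch.add(a, b), sync=torch.cuda.synchronize)`, and the custom operator returned by `load_add2_jit` would be timed the same way so that both measurements include the wait for the kernel to finish.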
