
facebookincubator/AITemplate

A Python framework that compiles deep neural networks into high-performance CUDA/HIP C++ code for GPU inference.

4.7k stars · Python · Inference · Serving
Velocity (7d): +3.3 ★ / day · Trend: steady

AITemplate compiles Python model definitions into portable CUDA/HIP C++ code, turning deep neural networks into optimized GPU kernels. It reaches near-peak FP16 performance on NVIDIA TensorCore and AMD MatrixCore hardware. The generated binaries are self-contained, with no third-party runtime dependencies, and the compiler applies both horizontal and vertical operator fusion.
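
To make "vertical fusion" and "Python to C++ code generation" concrete, here is a minimal toy sketch in plain Python. It is not AITemplate's API or its actual compiler: `ElementwiseOp`, `fuse_vertically`, and `emit_kernel` are hypothetical names invented for illustration, and the emitted code is a plain C++ loop, not a CUDA/HIP kernel. The idea shown is only that a chain of dependent elementwise operators can be collapsed into a single generated loop instead of one loop (and one memory round trip) per operator. Horizontal fusion, which merges independent operators that run side by side, is not shown.

```python
from dataclasses import dataclass


# Hypothetical toy IR (not AITemplate's): each op is a C++ expression
# template in which {x} stands for the op's input value.
@dataclass(frozen=True)
class ElementwiseOp:
    name: str
    expr: str


def fuse_vertically(ops):
    """Compose a chain of dependent elementwise ops into one C++ expression."""
    expr = "in[i]"
    for op in ops:
        # Feed the expression built so far into the next op's template.
        expr = op.expr.format(x=f"({expr})")
    return expr


def emit_kernel(name, ops):
    """Emit a single C++ loop for the whole chain, not one loop per op."""
    body = fuse_vertically(ops)
    return (
        f"void {name}(const float* in, float* out, int n) {{\n"
        f"  for (int i = 0; i < n; ++i) out[i] = {body};\n"
        f"}}\n"
    )


# Illustrative chain: add a constant bias, then apply ReLU.
chain = [
    ElementwiseOp("add_bias", "{x} + 1.0f"),
    ElementwiseOp("relu", "fmaxf({x}, 0.0f)"),
]
source = emit_kernel("fused_add_relu", chain)
print(source)
```

The generated function contains one loop whose body nests both operations, so the intermediate result of `add_bias` never has to be written out and read back. A real compiler additionally has to handle shapes, data types, and fusion into matrix-multiply kernels, none of which this sketch attempts.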
