
NVIDIA/TransformerEngine

NVIDIA's library for accelerating Transformer model training and inference on GPUs using FP8/FP4 precision.

Velocity (7d): +2.5 ★/day · Trend: steady

TransformerEngine is a performance library that provides optimized transformer layers, attention backends, and fused kernels for NVIDIA GPUs, including the Hopper, Ada, and Blackwell architectures. It supports low-precision training and inference in the FP8 and NVFP4 formats to reduce memory usage and increase throughput. It integrates with PyTorch, JAX, and major LLM frameworks, including NeMo and Megatron.
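To make the memory claim concrete, here is a rough back-of-the-envelope sketch of weight storage at different precisions. It does not use TransformerEngine itself. The helper `weight_memory_gib`, the format table, and the 7B parameter count are illustrative assumptions. The per-block scaling factors that real FP8 and NVFP4 recipes store alongside the data are ignored, so actual footprints will be somewhat larger.

```python
# Payload bytes per element for each storage format.
# Assumption: scaling-factor metadata used by FP8/NVFP4 recipes is ignored.
BYTES_PER_ELEMENT = {"fp32": 4.0, "bf16": 2.0, "fp8": 1.0, "fp4": 0.5}


def weight_memory_gib(num_params: int, fmt: str) -> float:
    """Approximate weight storage in GiB for num_params elements (hypothetical helper)."""
    return num_params * BYTES_PER_ELEMENT[fmt] / 2**30


if __name__ == "__main__":
    n = 7_000_000_000  # hypothetical 7B-parameter model
    for fmt in BYTES_PER_ELEMENT:
        print(f"{fmt}: {weight_memory_gib(n, fmt):.2f} GiB")
```

Under these assumptions, each halving of the element width halves weight storage. Activations, optimizer state, and any higher-precision master weights are separate costs not covered here.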
