NVIDIA/FasterTransformer

NVIDIA's optimized transformer inference library for BERT and GPT models on Volta, Turing, and Ampere GPUs.

Star velocity (7d): +3.4 ★ / day. Trend: steady.

FasterTransformer provides highly optimized encoder and decoder transformer components for inference on NVIDIA GPUs. It supports the BERT and GPT model families and integrates with PyTorch and TensorFlow. Development has moved to TensorRT-LLM, but the library remains available for existing use cases.
