bytedance/lightseq
A CUDA library for high-speed training and inference of transformer-based sequence models with int8/fp16 mixed-precision support.

Velocity (7d): +1.4 ★/day · Trend: steady
LightSeq provides optimized CUDA kernels for the transformer operations behind NLP sequence tasks such as machine translation and models such as BERT and GPT. It accelerates both training and high-throughput inference, with support for int8 and fp16 mixed-precision computation. It integrates with Fairseq and Hugging Face for model deployment and supports beam search, diverse decoding, and sampling strategies.
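
For readers unfamiliar with beam search, the following is a minimal pure-Python sketch of the general technique. It is not LightSeq's CUDA implementation and does not use the LightSeq API. The names `beam_search`, `toy_step`, and the probability table are hypothetical and exist only for illustration, and the sketch omits length normalization.

```python
import math


def beam_search(step_fn, bos, eos, beam_size=2, max_len=5):
    """Generic beam search; step_fn(prefix) returns {token: log_prob}."""
    beams = [([bos], 0.0)]  # (token sequence, cumulative log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for tok, logp in step_fn(seq).items():
                candidates.append((seq + [tok], score + logp))
        # Keep only the beam_size highest-scoring expansions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_size]:
            if seq[-1] == eos:
                finished.append((seq, score))
            else:
                beams.append((seq, score))
        if not beams:
            break
    finished.extend(beams)  # hypotheses cut off at max_len still compete
    return max(finished, key=lambda c: c[1])


def toy_step(prefix):
    # Hypothetical "model": next-token distribution depends on the last token.
    table = {
        "<s>": {"a": math.log(0.6), "b": math.log(0.4)},
        "a": {"</s>": math.log(0.5), "b": math.log(0.5)},
        "b": {"</s>": math.log(0.9), "a": math.log(0.1)},
    }
    return table[prefix[-1]]


best_seq, best_score = beam_search(toy_step, "<s>", "</s>")
```

In this toy setup, greedy decoding would commit to `a` first (probability 0.6) and finish with total probability 0.3, whereas a beam of width 2 keeps the `b` branch alive and finds the higher-probability sequence ending through `b`.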