Oneflow-Inc/libai
A toolbox built on OneFlow for large-scale distributed training of deep learning models, with data, tensor, and pipeline parallelism.

LiBai is an open-source framework for large-scale model training, built on OneFlow. It supports several parallelism strategies (data parallelism, tensor parallelism, and pipeline parallelism) alongside training techniques such as mixed-precision training, activation checkpointing, gradient accumulation, and the Zero Redundancy Optimizer (ZeRO). The toolbox covers both computer vision and NLP tasks, with predefined datasets such as CIFAR and ImageNet, as well as a dataset for BERT pretraining.
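
To make the three parallelism dimensions concrete, here is a minimal sketch in plain Python of how a pool of devices might be split across them. It does not use LiBai or OneFlow: `ParallelLayout` and its `coords` method are hypothetical names introduced for illustration, and the rank ordering (tensor index varying fastest, pipeline stage slowest) is an assumption, not necessarily the layout LiBai uses.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ParallelLayout:
    """Hypothetical helper (not a LiBai class) describing a 3D parallel layout."""

    data: int      # number of data-parallel replicas
    tensor: int    # number of tensor-parallel shards per layer
    pipeline: int  # number of pipeline stages

    @property
    def world_size(self) -> int:
        # Every device belongs to exactly one (stage, replica, shard) triple,
        # so the total device count is the product of the three sizes.
        return self.data * self.tensor * self.pipeline

    def coords(self, rank: int) -> Tuple[int, int, int]:
        """Map a flat device rank to (pipeline stage, data replica, tensor shard).

        Assumed ordering: the tensor index varies fastest, then data, then pipeline.
        """
        if not 0 <= rank < self.world_size:
            raise ValueError(f"rank {rank} is outside 0..{self.world_size - 1}")
        tensor_idx = rank % self.tensor
        data_idx = (rank // self.tensor) % self.data
        pipeline_idx = rank // (self.tensor * self.data)
        return pipeline_idx, data_idx, tensor_idx


# Example: 8 devices split 2 x 2 x 2.
layout = ParallelLayout(data=2, tensor=2, pipeline=2)
for rank in range(layout.world_size):
    print(rank, layout.coords(rank))
```

The point of the sketch is the constraint it encodes: the product of the data, tensor, and pipeline sizes must equal the number of devices, so increasing one dimension on a fixed cluster means shrinking another.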