
THUDM/SwissArmyTransformer

A PyTorch library providing shared backbone code for building and training Transformer variants including BERT, GPT, T5, GLM, CogView, and ViT.

1.1k stars · Python · ML Frameworks · Language Models
Velocity (7d): +0.7 ★/day · Trend: steady

SwissArmyTransformer is a framework for developing custom Transformer models that share a unified architecture backbone. It supports various model architectures such as BERT, GPT, T5, GLM, CogView, and ViT through lightweight mixin components. The library integrates DeepSpeed ZeRO and model parallelism to enable efficient pretraining and fine-tuning of large-scale models ranging from 100M to 20B parameters.
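
The "shared backbone plus lightweight mixins" idea can be illustrated with a minimal sketch in plain Python. This is not the library's API: the names `Backbone`, `Mixin`, `ArgmaxHeadMixin`, the `add_mixin` signature, and the hook names `layer_forward` and `final_forward` are all hypothetical, chosen only to show how a variant might override one piece of a common forward pass while reusing the rest.

```python
from typing import Callable, Dict, List


class Mixin:
    """Hypothetical base class: a mixin contributes named hook overrides."""

    def hooks(self) -> Dict[str, Callable]:
        return {}


class Backbone:
    """Toy shared backbone; each 'layer' just adds 1.0 to every element."""

    def __init__(self, num_layers: int) -> None:
        self.num_layers = num_layers
        self.mixins: Dict[str, Mixin] = {}
        # Default behaviour, replaceable per hook name by a mixin.
        self._hooks: Dict[str, Callable] = {
            "layer_forward": lambda x, layer_index: [v + 1.0 for v in x],
            "final_forward": lambda x: x,
        }

    def add_mixin(self, name: str, mixin: Mixin) -> None:
        """Register a mixin and let its hooks replace the defaults."""
        if name in self.mixins:
            raise ValueError(f"mixin {name!r} already registered")
        self.mixins[name] = mixin
        self._hooks.update(mixin.hooks())

    def forward(self, x: List[float]):
        # The layer loop is shared by every variant built on this backbone.
        for i in range(self.num_layers):
            x = self._hooks["layer_forward"](x, i)
        return self._hooks["final_forward"](x)


class ArgmaxHeadMixin(Mixin):
    """Hypothetical 'classification head': returns the index of the largest value."""

    def hooks(self) -> Dict[str, Callable]:
        return {"final_forward": lambda x: max(range(len(x)), key=x.__getitem__)}


plain = Backbone(num_layers=2)
classifier = Backbone(num_layers=2)
classifier.add_mixin("head", ArgmaxHeadMixin())

plain_out = plain.forward([0.0, 2.0, 1.0])
class_out = classifier.forward([0.0, 2.0, 1.0])
```

Both objects run the same layer loop; only the final step differs, which is the sense in which a mixin here is "lightweight". A real model would replace the toy arithmetic with attention and feed-forward layers.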