
microsoft/torchscale

Microsoft's PyTorch library for scaling Transformer architectures and developing foundation models including BitNet, RetNet, and LongNet.

3.1k stars · Python · Language Models · ML Frameworks
Velocity (7d): +2.4 ★ / day · Trend: steady

TorchScale provides foundational architectures for large language models and multimodal systems. It implements research innovations such as DeepNet for stable scaling to very deep models, Magneto for general-purpose modeling across language, vision, and speech, and BitNet and RetNet as potential Transformer successors. The library focuses on training stability, modeling capability, and efficiency, through sparse Mixture-of-Experts and length-extrapolation techniques for long contexts.
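
To make the "stable deep scaling" idea concrete, here is a minimal sketch of the DeepNorm constants described in the DeepNet paper for an encoder-only or decoder-only stack of N layers. It uses only the Python standard library and does not call TorchScale; the function name `deepnorm_constants` is hypothetical, and the formulas are taken from the paper rather than from this page, so treat it as an illustration rather than the library's implementation.

```python
def deepnorm_constants(num_layers: int) -> tuple[float, float]:
    """Return (alpha, beta) for a single stack of `num_layers` layers.

    Assumption: encoder-only / decoder-only setting from the DeepNet paper,
    where the residual update is LayerNorm(alpha * x + sublayer(x)) and
    selected sublayer weights are initialized with gain beta.
    """
    if num_layers < 1:
        raise ValueError("num_layers must be a positive integer")
    alpha = (2 * num_layers) ** 0.25   # up-weights the residual branch
    beta = (8 * num_layers) ** -0.25   # down-scales the initialization gain
    return alpha, beta


if __name__ == "__main__":
    for depth in (12, 100, 1000):
        alpha, beta = deepnorm_constants(depth)
        print(f"N={depth}: alpha={alpha:.3f}, beta={beta:.3f}")
```

As depth grows, alpha increases and beta shrinks, so each sublayer contributes a smaller update relative to the residual stream, which is the intuition behind training very deep stacks stably.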
