
NVIDIA/Megatron-LM

GPU-optimized library for training large transformer models at scale with advanced parallelism strategies.

Star velocity (7d): +6.3 ★ / day. Trend: steady.

This repository contains Megatron-LM and Megatron Core, frameworks for distributed training of transformer models. Megatron-LM provides pre-configured training scripts for research teams, while Megatron Core offers composable GPU-optimized building blocks including transformer architectures, advanced parallelism strategies (tensor, pipeline, data, expert, and context parallelism), and mixed precision support (FP16, BF16, FP8, FP4) for custom training pipelines.
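To make the parallelism list above concrete, here is a minimal sketch of how a fixed pool of GPUs is commonly split across these dimensions: the tensor, pipeline, and context degrees multiply into one model-parallel group, and the remaining factor becomes the data-parallel degree. The function name `data_parallel_size` and its parameters are hypothetical and purely illustrative. It is plain Python, not the Megatron Core API, and it leaves out expert parallelism, which adds further constraints.

```python
def data_parallel_size(world_size: int, tensor: int = 1,
                       pipeline: int = 1, context: int = 1) -> int:
    """Return the data-parallel degree left over after model parallelism.

    Hypothetical helper for illustration only: assumes each model replica
    occupies tensor * pipeline * context GPUs and the rest of the cluster
    is used for data parallelism.
    """
    if min(world_size, tensor, pipeline, context) < 1:
        raise ValueError("all sizes must be positive integers")

    gpus_per_replica = tensor * pipeline * context
    if world_size % gpus_per_replica != 0:
        raise ValueError(
            f"world_size={world_size} is not divisible by "
            f"tensor*pipeline*context={gpus_per_replica}"
        )
    return world_size // gpus_per_replica


# 64 GPUs, 8-way tensor parallel, 2-way pipeline parallel
print(data_parallel_size(64, tensor=8, pipeline=2))  # 4
```

Under these assumptions, raising any model-parallel degree shrinks the number of data-parallel replicas, so the degrees have to be chosen together rather than one at a time.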
