meta-pytorch/monarch
A distributed programming framework for PyTorch based on scalable actor messaging for multi-GPU training.

Velocity (7d): +2.6 ★ / day
Trend: steady
Monarch provides a Python API for creating remote actors that work with distributed tensors across processes and GPUs. It offers three main capabilities: fault tolerance through supervision trees, point-to-point RDMA transfers for memory-efficient GPU communication, and scalable broadcast messaging across actor meshes. The framework lets you create training processes imperatively across multiple GPUs, with built-in synchronization primitives.
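
To make the actor-mesh idea concrete, here is a minimal in-process sketch of broadcasting a call to a group of actors and routing failures to a supervisor callback. It deliberately does not use Monarch's API: ToyActor, ToyMesh, broadcast, and on_failure are hypothetical names invented for illustration, and everything runs in a single process using only the Python standard library.

```python
from typing import Any, Callable, List


class ToyActor:
    """Hypothetical stand-in for a remote actor; runs in-process."""

    def __init__(self, rank: int) -> None:
        self.rank = rank
        self.total = 0

    def add(self, value: int) -> int:
        # Reject negative input so the supervision path can be shown.
        if value < 0:
            raise ValueError(f"rank {self.rank}: negative value")
        self.total += value
        return self.total


class ToyMesh:
    """Hypothetical mesh: sends one call to every actor and reports
    failures to a supervisor callback instead of crashing the caller."""

    def __init__(self, size: int,
                 on_failure: Callable[[int, Exception], None]) -> None:
        self.actors = [ToyActor(rank) for rank in range(size)]
        self.on_failure = on_failure

    def broadcast(self, method: str, *args: Any) -> List[Any]:
        results: List[Any] = []
        for actor in self.actors:
            try:
                results.append(getattr(actor, method)(*args))
            except Exception as exc:
                # Supervisor is notified; the failed slot is recorded as None.
                self.on_failure(actor.rank, exc)
                results.append(None)
        return results


failures = []
mesh = ToyMesh(4, on_failure=lambda rank, exc: failures.append((rank, str(exc))))
first = mesh.broadcast("add", 3)    # every actor succeeds
second = mesh.broadcast("add", -1)  # every actor fails and is reported
```

In a real distributed setting the actors would live in separate processes and the broadcast would be asynchronous; the sketch only illustrates the shape of the pattern described above.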