AI-Hypercomputer/maxtext
A high-performance, scalable open-source LLM library written in JAX for training models on TPUs and GPUs.

MaxText provides a library of transformer-based large language models, including Gemma, Llama, DeepSeek, Qwen, and Mistral. It supports pre-training at scale (tens of thousands of chips) and post-training techniques such as Supervised Fine-Tuning (SFT), Group Relative Policy Optimization (GRPO), and Group Sequence Policy Optimization (GSPO). The library achieves high Model FLOPs Utilization (MFU) by relying on JAX and the XLA compiler for optimization, which keeps the code simple and largely free of hand-written optimizations.
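
For readers unfamiliar with the metric, Model FLOPs Utilization is the ratio of the model FLOPs a training run actually performs per second to the theoretical peak FLOPs per second of the hardware. The sketch below is a rough illustration, not MaxText's own calculation: it assumes the common approximation of about 6 FLOPs per parameter per token for a dense transformer (forward plus backward pass, ignoring attention FLOPs). The function name `model_flops_utilization` and all numbers are hypothetical.

```python
def model_flops_utilization(num_params, tokens_per_second, num_chips, peak_flops_per_chip):
    """Estimate MFU as achieved model FLOPs/s divided by aggregate peak FLOPs/s."""
    # Assumption: ~6 FLOPs per parameter per token (forward + backward pass)
    # for a dense transformer; attention FLOPs are ignored.
    achieved_flops_per_second = 6.0 * num_params * tokens_per_second
    peak_flops_per_second = num_chips * peak_flops_per_chip
    return achieved_flops_per_second / peak_flops_per_second


# Hypothetical run: an 8B-parameter model processing 1M tokens/s on 256 chips,
# each with an assumed peak of 4e14 FLOPs/s. These are not measured values.
mfu = model_flops_utilization(
    num_params=8e9,
    tokens_per_second=1.0e6,
    num_chips=256,
    peak_flops_per_chip=4.0e14,
)
print(f"Estimated MFU: {mfu:.3f}")
```

Because the estimate depends only on parameter count and token throughput, it can be compared across hardware generations, although it understates the FLOPs of long-context runs where attention cost is significant.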