alxndrTL/mamba.py
A pure PyTorch and MLX implementation of the Mamba state space model architecture for language modeling.

This repository implements Mamba in pure PyTorch, using a parallel scan over the sequence dimension for faster training and inference than a step-by-step sequential loop. It also supports related architectures such as Jamba (a Mamba and attention hybrid) and Vision Mamba, along with muP (maximal update parameterization) for better hyperparameter transfer across model scales. The implementation is now integrated into the Hugging Face transformers library, and it is available in two versions: PyTorch (including CUDA devices) and MLX.
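
To illustrate what a parallel scan computes, here is a minimal sketch in plain Python (no PyTorch, so it runs anywhere). It assumes a simplified scalar recurrence `h[t] = a[t] * h[t-1] + b[t]`, which has the same shape as the state update in a state space model. The function names `sequential_scan`, `combine`, and `parallel_scan` are hypothetical and chosen for this example. The Hillis-Steele scheme shown is one common scan variant, not necessarily the one this repository uses.

```python
def sequential_scan(a, b):
    """Reference recurrence: h[t] = a[t] * h[t-1] + b[t], with h[-1] = 0."""
    h, out = 0.0, []
    for a_t, b_t in zip(a, b):
        h = a_t * h + b_t
        out.append(h)
    return out


def combine(left, right):
    """Compose two affine steps h -> a*h + b, where `left` is applied first."""
    a1, b1 = left
    a2, b2 = right
    return (a1 * a2, a2 * b1 + b2)


def parallel_scan(a, b):
    """Hillis-Steele inclusive scan over (a, b) pairs.

    Within one pass of the while loop, every position i is computed
    independently of the others, so a vectorized backend could evaluate
    a whole pass at once. The number of passes grows as log2(len(a)).
    """
    elems = list(zip(a, b))
    offset = 1
    while offset < len(elems):
        elems = [
            combine(elems[i - offset], elems[i]) if i >= offset else elems[i]
            for i in range(len(elems))
        ]
        offset *= 2
    # The second component of each pair is the state h[t].
    return [h_t for _, h_t in elems]


if __name__ == "__main__":
    a = [0.5, 0.5, 0.5, 0.5]
    b = [1.0, 1.0, 1.0, 1.0]
    print(sequential_scan(a, b))
    print(parallel_scan(a, b))
```

Both functions return the same states. The scan works because composing affine updates is associative, which lets the sequence be combined in a tree-like order instead of strictly left to right.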