XiaomiMiMo/MiMo-V2-Flash
MiMo-V2-Flash is a Mixture-of-Experts language model with 309B total and 15B active parameters, optimized for reasoning, coding, and agentic tasks.

MiMo-V2-Flash is a foundation language model developed by Xiaomi. It employs a Mixture-of-Experts architecture with 309B total parameters, of which 15B are active. Its hybrid attention design combines sliding-window and global attention in a 5:1 ratio to balance long-context capability with inference efficiency, and it incorporates Multi-Token Prediction for improved performance. The model is designed specifically for high-speed reasoning and agentic workflows, and API access is available through Xiaomi’s platform.
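
To make the 5:1 attention ratio and the active-parameter share concrete, here is a minimal Python sketch. It is illustrative only and not Xiaomi's implementation: the function names, the 12-layer count, and the placement of the global layer at the end of each group of six are assumptions, since the description above gives only the ratio and the parameter counts.

```python
def attention_schedule(num_layers: int, swa_per_global: int = 5) -> list[str]:
    """Return a per-layer attention type for a hybrid stack.

    Assumption: each group of (swa_per_global + 1) layers ends with one
    global-attention layer; the real ordering may differ.
    """
    period = swa_per_global + 1
    return [
        "global" if (i + 1) % period == 0 else "sliding_window"
        for i in range(num_layers)
    ]


def active_fraction(active_b: float = 15.0, total_b: float = 309.0) -> float:
    """Share of parameters that are active, using the counts quoted above."""
    return active_b / total_b


if __name__ == "__main__":
    # Hypothetical 12-layer stack, chosen only to show two full 5:1 groups.
    schedule = attention_schedule(12)
    for index, kind in enumerate(schedule):
        print(f"layer {index:2d}: {kind}")
    print(f"active share: {active_fraction():.1%}")
```

With these numbers, roughly 5% of the total parameters are active, which is where the inference-efficiency benefit of the Mixture-of-Experts design comes from. Likewise, only one layer in six pays the cost of attending over the full context.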