Tongyi-MAI/MAI-UI
A family of foundation GUI agent models, ranging from 2B to 235B parameters, designed for real-world autonomous interface interaction.

MAI-UI provides GUI agent foundation models trained with dynamic reinforcement learning that scales to as many as 512 parallel environments. The framework lets agents interact with users and execute tasks through MCP (Model Context Protocol) tools, and it supports both on-device and cloud-based execution, with the choice depending on data sensitivity and task requirements. The models report state-of-the-art results on GUI grounding and navigation benchmarks across multiple model scales.
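
To make the on-device versus cloud choice concrete, here is a minimal sketch of how such a routing decision could look. It is not MAI-UI's actual logic: the names `Task`, `Target`, and `choose_target`, the `contains_sensitive_data` flag, and the step-count threshold are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    """Where a task is executed (hypothetical labels)."""
    ON_DEVICE = "on_device"
    CLOUD = "cloud"


@dataclass(frozen=True)
class Task:
    description: str
    contains_sensitive_data: bool  # assumed flag, e.g. credentials visible on screen
    estimated_steps: int           # assumed proxy for task difficulty


def choose_target(task: Task, max_on_device_steps: int = 10) -> Target:
    """Pick an execution target from data sensitivity and task requirements."""
    # Assumption: sensitive data is always kept on the device.
    if task.contains_sensitive_data:
        return Target.ON_DEVICE
    # Assumption: long tasks benefit from a larger cloud-hosted model.
    if task.estimated_steps > max_on_device_steps:
        return Target.CLOUD
    # Short, non-sensitive tasks stay local.
    return Target.ON_DEVICE
```

For example, `choose_target(Task("Pay a bill", True, 25))` keeps the task on the device because of the sensitivity flag, while the same task without the flag would be routed to the cloud. A real deployment would likely weigh more signals than these two.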