
OpenBMB/AgentCPM-GUI

An on-device GUI agent based on MiniCPM-V that autonomously operates Android apps using smartphone screenshots as input.

1.4k stars · Python · Agents
Velocity (7d): +3.5 ★/day · Trend: steady

AgentCPM-GUI is an open-source, 8-billion-parameter vision-language agent jointly developed by THUNLP, Renmin University of China, and ModelBest. Built on MiniCPM-V, it takes smartphone screenshots as input and autonomously executes user-specified tasks in Android apps. The model is trained with reinforcement fine-tuning (RFT) to strengthen its planning and reasoning, so that it thinks before it outputs an action. It supports both Chinese and English apps, and its action space is expressed in a concise JSON format that brings the average action length down to 9.7 tokens for efficient on-device inference.
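To illustrate why a concise JSON action format keeps outputs short, here is a minimal Python sketch. The field names (`tap`, `text`) and the helpers `encode_action` and `decode_action` are hypothetical and chosen only for illustration; the project's actual action schema is not shown on this page. Character count is used here as a rough stand-in for token count.

```python
import json


def encode_action(action: dict) -> str:
    """Serialize an action to compact JSON: no whitespace after separators.

    ensure_ascii=False keeps non-ASCII text (e.g. Chinese input) as-is
    instead of expanding it into longer \\uXXXX escapes.
    """
    return json.dumps(action, separators=(",", ":"), ensure_ascii=False)


def decode_action(text: str) -> dict:
    """Parse a JSON action string and check that it is a JSON object."""
    action = json.loads(text)
    if not isinstance(action, dict):
        raise ValueError("action must be a JSON object")
    return action


# Hypothetical actions: a tap at screen coordinates and a text entry.
tap = {"tap": [480, 320]}
type_text = {"text": "你好"}

compact = encode_action(tap)
pretty = json.dumps(tap, indent=2)  # human-friendly but much longer

print(compact)
print(len(compact), "chars compact vs", len(pretty), "chars pretty-printed")
print(encode_action(type_text))
```

The compact form drops every optional space and newline, which is the kind of saving that matters when each generated token costs on-device inference time.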
