BAAI-Agents/Cradle
An agent framework enabling foundation models to control computers through screenshots and keyboard/mouse operations for General Computer Control tasks.

Cradle is a framework for General Computer Control (GCC) that lets foundation models perform complex computer tasks through a unified interface. It takes screenshots as visual input and produces keyboard and mouse operations as output, so agents can interact with software applications, games, and productivity tools. The framework also provides reasoning, self-improvement, and skill curation for agents operating in diverse digital environments.
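The screenshot-in, keyboard/mouse-out interface described above can be pictured as a simple observe-plan-act loop. The sketch below is illustrative only and is not Cradle's actual API: `KeyPress`, `MouseClick`, `run_episode`, and `demo` are hypothetical names, the "screenshot" is a placeholder nested list rather than a real image, and the planner is a hard-coded stub standing in for a foundation model.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass(frozen=True)
class KeyPress:
    """Hypothetical keyboard action: press a single named key."""
    key: str


@dataclass(frozen=True)
class MouseClick:
    """Hypothetical mouse action: click at screen coordinates (x, y)."""
    x: int
    y: int
    button: str = "left"


Action = Union[KeyPress, MouseClick]
Screenshot = List[List[int]]  # placeholder for pixel data, not a real image type


def run_episode(
    capture: Callable[[], Screenshot],
    plan: Callable[[Screenshot, List[Action]], List[Action]],
    execute: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Repeat: capture a screenshot, ask the planner for actions, execute them.

    Stops early when the planner returns no actions, or after max_steps.
    Returns the list of actions executed, in order.
    """
    history: List[Action] = []
    for _ in range(max_steps):
        frame = capture()
        actions = plan(frame, history)
        if not actions:
            break
        for action in actions:
            execute(action)
            history.append(action)
    return history


def demo() -> List[Action]:
    """Run the loop against stubbed screenshots and a scripted planner."""
    frames = iter([[[0]], [[1]], [[2]]])
    executed: List[Action] = []

    def capture() -> Screenshot:
        return next(frames)

    def plan(frame: Screenshot, history: List[Action]) -> List[Action]:
        # Stub policy standing in for a model: click, press Enter, then stop.
        if len(history) == 0:
            return [MouseClick(10, 20)]
        if len(history) == 1:
            return [KeyPress("enter")]
        return []

    return run_episode(capture, plan, executed.append)


if __name__ == "__main__":
    for action in demo():
        print(action)
```

In a real setup, `capture` would grab the screen, `plan` would call a foundation model with the screenshot and history, and `execute` would drive an input-automation library. Keeping the three as injected callables makes the loop testable without touching the real desktop.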