OthersideAI/self-operating-computer

A framework that lets multimodal AI models autonomously operate computers by viewing screens and executing mouse/keyboard actions.

10.2k stars · Python · Agents · Coding Assistants
Velocity (7d): +11 ★ / day · Trend: steady

This framework lets AI models control a computer through the same inputs and outputs a human uses: the model observes the screen and decides on a sequence of mouse and keyboard actions to accomplish an objective. It integrates with multiple multimodal models, including GPT-4o, Claude 3, and Gemini Pro Vision. The decided actions are executed with automation tools such as PyAutoGUI. Released in late 2023, it was one of the early full computer-use examples.
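The observe, decide, act cycle described above can be illustrated with a minimal Python sketch. This is not the project's actual code: the names `Action`, `run_loop`, `fake_observe`, and `scripted_decide` are hypothetical, and the model is replaced by a fixed script so the example runs without an API key or a display. In a real setup, the `observe` hook might wrap `pyautogui.screenshot()` and the `execute` hook might dispatch to `pyautogui.click(x, y)`, `pyautogui.write(text)`, or `pyautogui.press(key)`.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Action:
    """One step chosen by the model (hypothetical schema for illustration)."""
    kind: str                    # "click", "write", "press", or "done"
    x: Optional[int] = None      # screen coordinates for "click"
    y: Optional[int] = None
    text: Optional[str] = None   # text for "write", key name for "press"


def run_loop(
    objective: str,
    observe: Callable[[], bytes],
    decide: Callable[[str, bytes, List[Action]], Action],
    execute: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Repeat observe -> decide -> act until the model reports "done"."""
    history: List[Action] = []
    for _ in range(max_steps):
        screen = observe()                           # capture the current screen
        action = decide(objective, screen, history)  # ask the model for a step
        if action.kind == "done":
            break
        execute(action)                              # perform mouse/keyboard input
        history.append(action)
    return history


def fake_observe() -> bytes:
    """Stand-in for a screenshot; a real hook might call pyautogui.screenshot()."""
    return b"fake-screenshot-bytes"


def scripted_decide(objective: str, screen: bytes, history: List[Action]) -> Action:
    """Stand-in for a multimodal model: replays a fixed script of actions."""
    script = [
        Action("click", x=100, y=200),
        Action("write", text="hello"),
        Action("press", text="enter"),
    ]
    return script[len(history)] if len(history) < len(script) else Action("done")


# Record actions instead of sending real input, so the sketch is safe to run.
executed: List[Action] = []
history = run_loop("type hello", fake_observe, scripted_decide, executed.append)
print([a.kind for a in history])  # → ['click', 'write', 'press']
```

Passing the screen capture, the model call, and the input backend as separate functions keeps the loop independent of any particular model or automation library, which is one plausible way to support several multimodal models behind the same interface.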