bytedance/UI-TARS-desktop
UI-TARS Desktop is a native desktop application providing a GUI agent based on the UI-TARS multimodal vision-language model for computer automation.

Velocity (7d): +72 ★/day · Trend: steady
The repository provides a multimodal AI agent stack with two components: Agent TARS (a CLI and Web UI agent) and UI-TARS Desktop (a native desktop GUI agent). The stack uses multimodal LLMs and vision models to automate computer tasks through a GUI agent that interacts directly with desktop interfaces. It integrates with MCP tools and is built on the open-source UI-TARS vision-language model.