lycorp-jp/sim-use

A CLI that lets your LLM tap and swipe through mobile simulators

sim-use translates mobile simulator screens into token-efficient outlines so AI agents can tap, swipe, and type by element alias instead of raw coordinates.

526 stars · Swift · Coding Assistants · Agents
Feature · 04 Jul 2026
The CLI That Lets AI Agents Physically Use Your Phone

A new open-source CLI treats mobile simulators as Unix peripherals so LLMs can finally press buttons, type text, and verify what happens.


What it does

sim-use is a macOS CLI that bridges AI agents to iOS Simulators and to Android emulators and devices through a single command surface. It walks the full accessibility tree (including WebViews, system overlays, and embedded content) and condenses the screen into a compact text outline that an LLM can parse in a few hundred tokens. Agents act on the screen using cached element aliases (@N), stable IDs, or labels, so they can observe, tap, swipe, and verify without touching raw coordinates.
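To make the outline-and-alias idea concrete, here is a minimal Python sketch. It is not sim-use's code or output format: the `Node` type, the `outline` function, and the exact text layout are hypothetical, chosen only to show how a tree can be flattened into a few short lines while interactive elements receive @N aliases that can be resolved later.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical accessibility node; not sim-use's real schema."""
    role: str
    label: str = ""
    interactive: bool = False
    children: list[Node] = field(default_factory=list)


def outline(root: Node) -> tuple[str, dict[str, Node]]:
    """Flatten a tree into indented text, aliasing interactive nodes as @N."""
    lines: list[str] = []
    aliases: dict[str, Node] = {}

    def walk(node: Node, depth: int) -> None:
        text = f"{node.role} {node.label!r}" if node.label else node.role
        if node.interactive:
            # Aliases are numbered in traversal order, starting at @1.
            alias = f"@{len(aliases) + 1}"
            aliases[alias] = node
            text = f"{alias} {text}"
        lines.append("  " * depth + text)
        for child in node.children:
            walk(child, depth + 1)

    walk(root, 0)
    return "\n".join(lines), aliases


screen = Node("window", children=[
    Node("text", "Sign in"),
    Node("textfield", "Email", interactive=True),
    Node("button", "Continue", interactive=True),
])
text, aliases = outline(screen)
print(text)
```

In this sketch the agent would read the printed outline, then refer to `@2` on its next step; the cached `aliases` mapping turns that back into the "Continue" button without any coordinates appearing in the prompt.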

The interesting bit

The project treats accessibility APIs as an agent-native control plane rather than a testing afterthought. It bundles a skill file for Claude and other agents, emits structured JSON envelopes with actionable error hints, and keeps a per-device background daemon warm so that observe-act loops settle at roughly 300 ms after the first call.
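As an illustration of why a structured envelope with a hint helps an agent loop, the sketch below parses one result and decides what to do next. The field names (`ok`, `error`, `message`, `hint`) and the sample text are assumptions made for this example; the source does not document sim-use's actual envelope schema.

```python
import json


def next_step(raw: str) -> str:
    """Turn one hypothetical result envelope into a decision for the agent."""
    envelope = json.loads(raw)
    if envelope.get("ok"):
        return "done"
    error = envelope.get("error", {})
    hint = error.get("hint")
    message = error.get("message", "unknown error")
    # Prefer the hint when present: it tells the agent how to recover.
    return f"retry: {hint}" if hint else f"fail: {message}"


# Hypothetical failure envelope, invented for this example.
sample = json.dumps({
    "ok": False,
    "error": {
        "message": "alias @7 not found",
        "hint": "observe the screen again to refresh aliases",
    },
})
print(next_step(sample))
```

The point of the design is that the recovery instruction travels with the error, so the agent does not have to infer a fix from a bare failure message.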

Key highlights

  • One CLI surface covers both the iOS Simulator and Android emulators and devices; the backend switches automatically based on the shape of the device ID.
  • The ui output is roughly 16× more compact than raw accessibility JSON, which keeps the agent's context-window usage low.
  • Alias-cached taps (@N) let an agent reference elements from its last observation without fragile coordinates.
  • A per-device daemon amortizes setup cost, and iOS batch mode reuses a single HID session across multiple steps to cut round-trip overhead.
  • The paste command works around HID keycode limitations by using the simulator pasteboard, handling CJK, emoji, and diacritics that raw key injection cannot.
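The first highlight, choosing a backend from the shape of the device ID, can be sketched as follows. The rule sim-use actually applies is not stated in the source, so this is an assumption: it relies only on the fact that iOS Simulator UDIDs are UUID-formatted, while adb serials such as `emulator-5554` are not. The function name `pick_backend` is hypothetical.

```python
import re

# iOS Simulator UDIDs are UUIDs: 8-4-4-4-12 hexadecimal digits.
_UUID = re.compile(r"^[0-9A-Fa-f]{8}(-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}$")


def pick_backend(device_id: str) -> str:
    """Guess the backend from the ID's shape (assumed rule, not sim-use's)."""
    return "ios" if _UUID.match(device_id) else "android"


print(pick_backend("A1B2C3D4-E5F6-7890-ABCD-EF1234567890"))
print(pick_backend("emulator-5554"))
```

Routing on the ID alone means the agent never has to pass a platform flag, which keeps the command surface identical across both platforms.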

Caveats

  • macOS 14+ only, and building from source requires compiling large XCFrameworks derived from Meta’s idb.
  • Android streaming video is not supported, and some iOS pasteboard flows require manual approval of the “Allow Paste” prompt per app session.
  • Batch chaining and low-level HID key sequences are iOS-only; Android text input uses a different abstraction.

Verdict

Worth a look if you’re building agentic workflows that need to build, verify, and iterate on real mobile UIs without human babysitting. Probably overkill if you just need occasional manual simulator screenshots.

Frequently asked

What is lycorp-jp/sim-use?
sim-use translates mobile simulator screens into token-efficient outlines so AI agents can tap, swipe, and type by element alias instead of raw coordinates.
Is sim-use open source?
Yes — lycorp-jp/sim-use is open source, released under the Apache-2.0 license.
What language is sim-use written in?
lycorp-jp/sim-use is primarily written in Swift.
How popular is sim-use?
lycorp-jp/sim-use has 526 stars on GitHub.
Where can I find sim-use?
lycorp-jp/sim-use is on GitHub at https://github.com/lycorp-jp/sim-use.
