omnimind-ai/OpenOmniBot

Android's first proper AI butler, not just a chatbot

An on-device agent that actually taps, swipes, and schedules on your phone instead of just talking about it.

★ 1.7k stars · Kotlin · Agents · Computer Vision


Velocity (7d): +20 ★ / day · Trend: → steady

What it does

OpenOmniBot is an Android AI agent built in Kotlin and Flutter. The agent itself runs on the phone, with model inference either local or routed to cloud model APIs. It sees your screen through vision models, performs gestures, manipulates apps, manages calendars and alarms, and can even run a local Alpine environment or terminal. The loop is explicit: understand → decide → execute → reflect.
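The understand → decide → execute → reflect loop can be sketched in a few lines. This is a minimal illustration in Python, not OpenOmniBot's actual code: the `AgentLoop` and `Step` names, the four callables, and the toy screen names in `demo` are all hypothetical stand-ins for the vision model, planner, and gesture executor.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    observation: str
    action: str
    result: str
    done: bool


@dataclass
class AgentLoop:
    # All four callables are hypothetical stand-ins, not OpenOmniBot APIs.
    understand: Callable[[], str]              # describe the current screen
    decide: Callable[[str, List[Step]], str]   # choose the next action
    execute: Callable[[str], str]              # perform it, return the outcome
    reflect: Callable[[str, str, str], bool]   # True once the goal is reached
    history: List[Step] = field(default_factory=list)

    def run(self, max_steps: int = 10) -> List[Step]:
        # The step cap keeps a confused agent from looping forever.
        for _ in range(max_steps):
            observation = self.understand()
            action = self.decide(observation, self.history)
            result = self.execute(action)
            done = self.reflect(observation, action, result)
            self.history.append(Step(observation, action, result, done))
            if done:
                break
        return self.history


def demo() -> List[Step]:
    # Toy "phone": three fake screens visited in order.
    screens = ["home", "clock_app", "alarm_set"]
    state = {"index": 0}

    def understand() -> str:
        return screens[state["index"]]

    def decide(observation: str, history: List[Step]) -> str:
        return "tap:clock" if observation == "home" else "tap:add_alarm"

    def execute(action: str) -> str:
        state["index"] += 1
        return screens[state["index"]]

    def reflect(observation: str, action: str, result: str) -> bool:
        return result == "alarm_set"

    return AgentLoop(understand, decide, execute, reflect).run()
```

Calling `demo()` walks the fake screens from `home` to `alarm_set` in two steps. In a real agent the reflect stage would presumably feed back into the next decision, which is why `decide` receives the history here.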

The interesting bit

Most “AI assistants” stop at text. This one treats your phone as a physical environment to operate. It also supports a remote Codex bridge: the phone connects over the LAN, paired by QR code, to OpenAI’s CLI tool running on a laptop. It is an odd but practical workaround for serious coding tasks.

Key highlights

Vision-driven UI automation: screenshots, accessibility service, and gesture execution
Extensible skill system via git repo links (community collection at OpenMinis/MinisSkills)
Local inference option with MNN or llama.cpp backends, or cloud model APIs
Scheduled tasks with subagent delegation, plus short/long-term memory with embeddings
Embedded Alpine environment and ReTerminal integration for proper Linux tooling on Android
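To make the "memory with embeddings" highlight concrete, here is a rough Python sketch of embedding-based recall using cosine similarity. It is an assumption about how this kind of memory typically works, not the project's implementation: the `MemoryStore` class is hypothetical, and the three-dimensional vectors are hand-made placeholders for what a real embedding model would produce.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class MemoryStore:
    """Hypothetical in-memory store: each entry is (text, embedding)."""

    def __init__(self) -> None:
        self._items: List[Tuple[str, List[float]]] = []

    def add(self, text: str, embedding: Sequence[float]) -> None:
        self._items.append((text, list(embedding)))

    def search(self, query: Sequence[float], top_k: int = 1) -> List[Tuple[str, float]]:
        # Score every memory against the query, highest similarity first.
        scored = [(text, cosine(query, emb)) for text, emb in self._items]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


store = MemoryStore()
# Placeholder vectors; a real setup would call an embedding model here.
store.add("alarm at 7am on weekdays", [1.0, 0.0, 0.0])
store.add("prefers dark mode", [0.0, 1.0, 0.0])
store.add("commutes by train", [0.0, 0.0, 1.0])

best = store.search([0.9, 0.1, 0.0], top_k=2)
```

With these toy vectors, `best` ranks the alarm memory first and the dark-mode memory second. A linear scan like this is fine for a few hundred memories; larger stores usually switch to an approximate nearest-neighbor index.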

Caveats

Build process is involved: two separate editions (standard vs. omniinfer), nested git submodules for local inference, and Flutter/Android toolchain fragility (the README literally includes a flutter clean troubleshooting step)
Memory embedding requires a separate embedding model; multimodal models are strongly recommended for core scenarios, which implies significant setup and likely API costs
1,608 stars suggest early traction, but the project is not yet battle-tested at scale

Verdict

Worth a look if you want an autonomous Android agent that actually does things rather than just suggesting them. Skip it if you need a polished consumer app today, or if your threat model can’t tolerate broad accessibility-service permissions.