A browser remote control that speaks CLI, not Python
Vercel's Rust-native tool lets AI agents drive Chrome through shell commands instead of bloated Playwright scripts.

What it does
agent-browser is a native Rust CLI that automates Chrome through the Chrome DevTools Protocol. You install it via npm, Homebrew, or Cargo, run agent-browser install to fetch Chrome for Testing, then issue shell commands like agent-browser open example.com, snapshot, click @e2, screenshot. No Python, no Playwright runtime, no Node.js daemon — just a fast binary talking directly to the browser.
The interesting bit
The accessibility-tree snapshot command returns numbered refs (@e1, @e2) that subsequent commands can target directly. This gives AI agents a stable, semantic coordinate system without wrestling with flaky CSS selectors. There’s also a batch mode that pipes JSON or quoted commands through stdin, cutting per-command process startup overhead for multi-step workflows.
Key highlights
- Native Rust binary, distributed through npm/Homebrew/Cargo with auto-updating
- Semantic locators:
find role button click --name "Submit",find text "Sign In" click - Network interception, HAR recording, mock responses, request filtering by type/method/status
- Clipboard, mouse, geolocation, device emulation, offline mode, basic auth headers
chatcommand for natural-language browser control (single-shot or REPL)
Caveats
- The
chatAI feature is mentioned but not documented in depth; unclear which model/provider it uses or if it requires API keys - Linux needs
--with-depsflag for system dependencies - README truncates mid-sentence on tab IDs, suggesting docs are still being written
Verdict
Worth a look if you’re building AI agents that need to manipulate browsers and you’re tired of Playwright’s heft. Skip it if you need cross-browser support (Chrome only) or deep integration with existing Node.js test suites.