normal-computing/fuji-web

Your browser, but with an intern who actually reads the manual

Fuji-Web is a sidepanel AI agent that sees your tabs, clicks buttons, and fills forms while narrating every move.

599 stars · TypeScript · Agents, Coding Assistants
Velocity (7d): +0.6 ★/day · Trend: steady

What it does Fuji-Web is a Chrome extension that parks an AI agent in your browser’s sidepanel. You type a task, such as “book a flight to Sapporo” or “find the cheapest USB-C cable on this page,” and it navigates the site, clicks elements, enters text, and extracts information step by step, explaining what it is doing as it goes. The agent runs client-side: your OpenAI or Anthropic API key stays in browser storage and is sent only to the provider you chose, with no intermediary server in between.

The interesting bit The agent doesn’t just parse HTML soup. It annotates the live page with visual labels (inspired by Microsoft’s UFO research), giving the LLM a kind of “augmented reality” view of clickable elements. This bridges the gap between how humans see pages and how models reason about structure.
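
To make the labeling idea concrete, here is a minimal TypeScript sketch, not Fuji-Web’s actual code. It assumes a hypothetical `PageElement` shape (in a real extension these values would come from the DOM), an assumed list of "interactive" tags, and two illustrative helpers, `labelInteractiveElements` and `buildLegend`. It numbers the visible clickable elements and builds the text legend that could accompany an annotated screenshot.

```typescript
// Hypothetical element shape for illustration; a real extension would read
// these values from the live DOM rather than from hand-built objects.
interface PageElement {
  tag: string;
  text: string;
  rect: { x: number; y: number; width: number; height: number };
}

interface LabeledElement extends PageElement {
  label: number;
}

// Tags treated as interactive in this sketch (an assumption, not Fuji-Web's list).
const INTERACTIVE_TAGS = new Set(["a", "button", "input", "select", "textarea"]);

// Keep interactive elements with a non-zero area and number them in page order.
function labelInteractiveElements(elements: PageElement[]): LabeledElement[] {
  return elements
    .filter((el) => INTERACTIVE_TAGS.has(el.tag.toLowerCase()))
    .filter((el) => el.rect.width > 0 && el.rect.height > 0)
    .map((el, index) => ({ ...el, label: index + 1 }));
}

// Build a text legend mapping each numeric label to its element.
function buildLegend(labeled: LabeledElement[]): string {
  return labeled.map((el) => `[${el.label}] <${el.tag}> ${el.text}`).join("\n");
}

// Made-up sample page: one heading, one link, one hidden input, one button.
const samplePage: PageElement[] = [
  { tag: "div", text: "Welcome", rect: { x: 0, y: 0, width: 800, height: 40 } },
  { tag: "a", text: "Flights", rect: { x: 10, y: 50, width: 80, height: 20 } },
  { tag: "input", text: "Destination", rect: { x: 10, y: 80, width: 0, height: 0 } },
  { tag: "button", text: "Search", rect: { x: 10, y: 110, width: 90, height: 30 } },
];

const labeled = labelInteractiveElements(samplePage);
const legend = buildLegend(labeled);
console.log(legend);
```

With labels like these, the model can answer "click 2" rather than guessing at a CSS selector, and the extension can map the number back to the element’s coordinates.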

Key highlights

  • Sidepanel UI keeps the agent visible without hijacking your tab
  • Supports both OpenAI and Anthropic APIs, with prompts sent directly to your chosen provider
  • Open-source React/Vite extension; build from source with pnpm
  • Roadmap includes Puppeteer/Playwright integration, cross-tab workflows, and a shared knowledge base
  • Acknowledged debt to TaxyAI’s browser extension and a Chrome boilerplate project

Caveats

  • Requires manual install from GitHub releases (not in Chrome Web Store)
  • You must bring your own API key; no free tier or bundled credits
  • Page refresh sometimes needed for the extension to activate

Verdict Worth a look if you’re building browser automation tools or want to study how LLMs can interact with live DOMs. Skip it if you need production reliability today: the roadmap is ambitious and the install friction is real.
