A single-file proxy that pickpockets Gemini's web UI for an OpenAI API
Because Google's free web chat doesn't have an official API, so someone built the unofficial one by reverse-engineering its private protocol.

What it does
gemini-web2api is a single Python file that runs a local server translating OpenAI-style API calls into Google Gemini’s internal web protocol. Point any OpenAI client at localhost:8081/v1 and chat with Gemini’s free web tier—no Google Cloud project, no billing, no API key required unless you want one.
The interesting bit
The project reverse-engineered Gemini’s StreamGenerate endpoint by studying the frontend JavaScript and mapping the MODE_CATEGORY enum to a mysterious field [79] in a protobuf-like payload. It also exposes Google’s native /v1beta/models endpoints, so the official Gemini CLI works against your local proxy.
Key highlights
- Zero dependencies beyond Python stdlib; runs anywhere
- Supports tool calling, SSE streaming, and adjustable thinking depth via
@think=Nmodel suffixes - Optional Bearer auth, or completely open if
api_keysis empty - Docker and proxy support (Clash, V2Ray, etc.) built in
- Codex CLI compatibility via
/v1/responsesendpoint
Caveats
- No image or multimodal input: Gemini uses a proprietary streaming RPC (
WIZ/ProcessFile) that can’t be proxied over standard HTTP - “Pro” model routing is cosmetic without authenticated cookies; it falls back to Flash
- Multi-turn context is faked by stuffing previous messages into the prompt; each request is stateless
- Google can and will rate-limit you if you hammer it
Verdict
Worth a look if you want a free, private LLM backend for local tools or self-hosted apps. Skip it if you need reliable vision support, real multi-turn memory, or production-grade uptime.