Free proxies that trick Claude Code into using Chinese LLMs
A collection of Windows executables that intercept Anthropic's CLI tool and reroute requests to Nvidia-hosted Chinese models like Kimi, GLM, and DeepSeek.

What it does
This repo distributes pre-built Windows .exe files that act as local proxies between Anthropic’s Claude Code CLI and various Chinese large language models. You run the proxy, point Claude Code at it, and your requests go to Nvidia’s free tier of models like Kimi K2.5, GLM 4.7, Qwen3 Coder, Step-3.5-flash, or DeepSeek-V3.2 instead of Anthropic’s own API.
The interesting bit
The whole thing is essentially a protocol shim: Claude Code speaks Anthropic’s API format, these proxies translate on the fly, and Nvidia’s free model hosting foots the compute bill. It’s a neat hack for zero-cost access, though the README offers no detail on how the translation works.
Key highlights
- Five model variants, all routing through Nvidia’s free tier
- Single
.exeper model (or shared binary for most); download and run - WeChat bot component listed as “已下线” (offline) — no longer active
- Documentation lives entirely in external WeChat articles (Chinese language)
- No source code visible in the repo; binaries only
Caveats
- No source code provided — you’re running opaque Windows executables from a stranger’s GitHub
- README is extremely sparse; all technical detail is offloaded to WeChat links
- The shared
claude-nvidia-proxy.exeacross four models suggests hardcoded or lightly-configured differences
Verdict
Worth a look if you’re on Windows, comfortable with binary-only tools, and want to freeload on Nvidia’s model hosting through Claude Code’s interface. Everyone else — especially the source-code-paranoid and non-Windows crowd — should keep scrolling.