The Overnight Rewrite: How a Source Map Leak Spawned an Open Agent Harness

Staff Writer

Claw Code is a clean-room Rust-and-Python reimplementation of Claude Code's architecture, born from an accidental npm exposure and built in the open before the DMCA notices could land.

The Leak That Wasn’t a Hack

On March 31, 2026, security researcher Chaofan Shou noticed something odd in the npm registry: version 2.1.88 of @anthropic-ai/claude-code shipped with a 59.8 MB JavaScript source map file attached. Source maps are debugging artifacts—they translate minified production code back into readable TypeScript. They are not supposed to ship in public packages. The mechanism wasn’t exotic. Bun’s bundler, which Claude Code uses internally, generates source maps by default unless explicitly disabled. An open Bun bug filed weeks earlier reported that source maps appear in production builds even when they shouldn’t. If that’s what caused the leak, then Anthropic’s own build toolchain shipped a known issue that exposed their own product. Nobody had to hack anything. The file was just there, pointing to a zip archive sitting on Anthropic’s own Cloudflare R2 storage bucket.
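For readers who have not opened one, a source map is ordinary JSON. The sketch below is a minimal illustration, assuming the standard version 3 layout in which a `sources` list names the original files and an optional `sourcesContent` list carries their text. The file names and contents are invented for the example, and nothing here reproduces the leaked package or the exact way its map referenced the original code.

```python
import json

# A tiny, made-up source map in the version 3 layout.
# File names and contents are hypothetical, not from any real package.
EXAMPLE_MAP = json.dumps({
    "version": 3,
    "file": "cli.js",
    "sources": ["src/main.ts", "src/tools/bash.ts"],
    "sourcesContent": [
        "export function main(): void {}\n",
        None,  # content may be omitted (null) for individual sources
    ],
    "names": [],
    "mappings": "",
})


def embedded_sources(raw_map: str) -> dict[str, str]:
    """Return {source path: original text} for sources embedded in the map."""
    data = json.loads(raw_map)
    if data.get("version") != 3:
        raise ValueError("only source map version 3 is handled in this sketch")
    paths = data.get("sources", [])
    contents = data.get("sourcesContent") or []
    recovered = {}
    for index, path in enumerate(paths):
        # sourcesContent is optional and may be shorter than sources.
        text = contents[index] if index < len(contents) else None
        if text is not None:
            recovered[path] = text
    return recovered


recovered = embedded_sources(EXAMPLE_MAP)
print(sorted(recovered))
```

The point is how little effort is involved: once a map ships alongside a bundle, getting back to readable source is a matter of parsing JSON, not reverse engineering.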

Anthropic confirmed the incident as “a release packaging issue caused by human error, not a security breach.” This was, notably, the second time it had happened—a nearly identical source map leak occurred with an earlier version in February 2025. What was exposed: approximately 512,000 lines of TypeScript across roughly 1,900 files—the query engine, tool system, multi-agent orchestration logic, context compaction, and 44 feature flags covering functionality built but not yet shipped. No customer data, no API credentials, no model weights. But those feature flags are strategically sensitive. Compiled code sitting behind flags that evaluate to false in the external build isn’t just implementation detail—it’s a product roadmap. Competitors can now see what Anthropic has built and is considering shipping. That strategic surprise can’t be un-leaked.

The Response: Build Before the Takedown

Within hours of the exposure, mirrored repositories appeared on GitHub. Anthropic began issuing DMCA takedowns. The internet did not wait. Sigrid Jin, a Korean developer whom the Wall Street Journal had profiled as one of the world’s most active Claude Code power users (more than 25 billion Claude Code tokens consumed in the past year), published what became claw-code. The repo reached 50,000 stars in two hours, one of the fastest accumulation rates GitHub has recorded. The current repository of record, ultraworkers/claw-code, is a community-maintained continuation of that initial burst.

The important distinction: claw-code is not an archive of the leaked TypeScript. It’s a clean-room rewrite, built from scratch by reading the original harness structure and reimplementing the architectural patterns without copying Anthropic’s proprietary source. Jin built the initial version overnight using oh-my-codex, an orchestration layer on top of OpenAI’s Codex, with parallel code review and persistent execution loops. Whether clean-room reimplementation fully sidesteps copyright exposure is a legal question, not a technical one. The repo is explicit: “This repository does not claim ownership of the original Claude Code source material.” But framing and legal outcome aren’t the same thing, and the disclaimer sits there like a flag on uncertain ground.

What the Architecture Reveals

The real value here—for builders—isn’t the drama. It’s what the exposed architecture tells us about how production-grade agentic coding systems are actually structured. Claude Code isn’t a single chatbot loop. The community analysis that spread widely after the leak describes its core: one agent loop, 40+ discrete tools, on-demand skill loading, context compression, subagent spawning, a task system with dependency graphs, and worktree isolation for parallel execution.
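The "task system with dependency graphs" item is the easiest of these to make concrete. The following is a minimal sketch, assuming tasks are plain strings and each task lists the tasks it depends on; the task names and the `parallel_batches` helper are hypothetical and are not drawn from Claude Code or claw-code. It uses Python's standard `graphlib` module to find which tasks can run side by side.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks mapped to the set of tasks each one depends on.
TASKS = {
    "read_spec": set(),
    "write_parser": {"read_spec"},
    "write_tests": {"read_spec"},
    "run_tests": {"write_parser", "write_tests"},
}


def parallel_batches(graph: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into batches whose members have no unmet dependencies."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches = []
    while sorter.is_active():
        # Everything returned here is ready now and could run in parallel.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)  # unblock whatever depended on this batch
    return batches


batches = parallel_batches(TASKS)
for number, batch in enumerate(batches, start=1):
    print(number, batch)
```

In this toy graph the two writing tasks land in the same batch, which is the property that makes worktree isolation useful: independent tasks need separate working copies if they are going to run at the same time.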

The subagent pattern is worth understanding. When a task risks filling the primary context window, the system spawns independent agent instances with their own context and task scope. Exploratory work doesn’t contaminate the main thread. Tasks run in parallel without blocking each other. What’s notable is how minimal the orchestration layer is. It doesn’t impose rigid workflows. It provides tools, manages permissions and context, and gets out of the way. The model does the reasoning. The harness creates conditions for that reasoning to be useful.
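A toy version of that isolation can be sketched in a few lines. This is an illustration under stated assumptions, not the harness's actual design: the `Agent` class, its `run_subtask` method, and the idea of a context as a list of strings are all hypothetical, and the "work" is a stand-in for real model calls.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Agent:
    """A toy agent whose context is a list of message strings."""
    name: str
    context: list[str] = field(default_factory=list)

    async def run_subtask(self, task: str) -> str:
        # The child starts with a fresh context containing only its task.
        child = Agent(name=f"{self.name}/sub:{task}", context=[task])
        # Stand-in for exploratory work that would bloat the parent's context.
        for step in range(3):
            child.context.append(f"scratch note {step} for {task}")
            await asyncio.sleep(0)  # yield so sibling subagents can interleave
        # Only a short summary crosses back to the parent.
        return f"{task}: done in {len(child.context)} messages"


async def main() -> Agent:
    parent = Agent(name="main", context=["user: refactor the parser"])
    tasks = ["survey call sites", "check test coverage"]
    # Subtasks run concurrently; gather returns results in input order.
    summaries = await asyncio.gather(*(parent.run_subtask(t) for t in tasks))
    parent.context.extend(summaries)
    return parent


parent = asyncio.run(main())
print(parent.context)
```

The parent ends up with one line per subtask rather than every scratch note, which is the whole trade: the child can spend context freely because almost none of it is returned.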

Claw Code reimplements these patterns in a hybrid Python-and-Rust stack. The Python layer handles agent orchestration and LLM integration; the Rust layer, comprising roughly 73% of the codebase in the migration branch, targets performance-critical paths. The architecture includes 19 built-in, permission-gated tools, among them file reading, Bash execution, web scraping, LSP integration, and Git operations, each implemented as a self-contained unit with granular permission controls. The query engine manages all LLM API calls, response streaming, caching strategies, and multi-step orchestration, with a provider-agnostic design, configurable turn limits, and budget controls. MCP (Model Context Protocol) support spans six transport types: Stdio, SSE, HTTP, WebSocket, SDK, and ClaudeAiProxy.
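What "permission-gated" means in practice can be shown with a small registry. The sketch below is an assumption-laden illustration rather than claw-code's API: the `Permission` enum, the `Tool` and `ToolRegistry` classes, and the two example tools are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Permission(Enum):
    READ = "read"
    WRITE = "write"
    EXECUTE = "execute"


@dataclass(frozen=True)
class Tool:
    """A self-contained tool: a name, the permission it needs, and a callable."""
    name: str
    required: Permission
    run: Callable[[str], str]


class ToolRegistry:
    def __init__(self, granted: set[Permission]) -> None:
        self._granted = granted
        self._tools: dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        self._tools[tool.name] = tool

    def call(self, name: str, argument: str) -> str:
        tool = self._tools[name]  # KeyError for unknown tools
        # The gate sits in the harness, so the model cannot talk its way past it.
        if tool.required not in self._granted:
            raise PermissionError(f"{name} needs {tool.required.value} permission")
        return tool.run(argument)


# Hypothetical session that has only been granted read access.
registry = ToolRegistry(granted={Permission.READ})
registry.register(Tool("read_file", Permission.READ, lambda path: f"<contents of {path}>"))
registry.register(Tool("bash", Permission.EXECUTE, lambda cmd: f"<ran {cmd}>"))

print(registry.call("read_file", "README.md"))
try:
    registry.call("bash", "rm -rf build")
except PermissionError as error:
    print("blocked:", error)
```

Keeping the check in one `call` path, rather than inside each tool, is one plausible way to get the "granular permission controls" the paragraph describes without every tool reimplementing them.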

The Rust Migration and Its Context

The Rust port isn’t merely a performance optimization. It sits at the intersection of two larger trends: AI-assisted coding making language migration more feasible, and Rust gaining traction as a systems language for security-critical infrastructure. Microsoft’s distinguished engineer Galen Hunt has stated his goal to eliminate every line of C and C++ from Microsoft by 2030, using AI and algorithms to rewrite the company’s largest codebases. Meta has already rewritten its WhatsApp MP4 processing library in Rust, calling it the largest global rollout of any Rust library. Linus Torvalds has declared himself “a huge believer” in using AI to maintain code, while Rust has graduated to being a co-equal language with C for mainstream Linux development.

But the migration caveats are real. As Rosario Mastrogiacomo, chief strategy officer for Sphere Technology Solutions, notes, large language models are good at translating syntax, “shape-mapping” APIs, and producing first-pass rewrites that compile. The hard part of language migration is rarely syntax. It’s semantics and invariants: memory ownership assumptions, lifetime rules, concurrency contracts, error-handling behavior, performance characteristics, and compatibility expectations. AI helps you move faster, but it doesn’t automatically know what “correct” means in a production subsystem without strong specs, tests, fuzzing, and human review. Christopher Robinson, chief security architect at the Open Source Security Foundation, adds that AI lacks the context embedded in older, large, complex systems, and that this gap exists before anyone has to defend against hallucinations or the other quirks specific to large AI systems.
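A small, well-known example shows how a rewrite can compile and still be wrong. Python's `//` operator floors toward negative infinity, while integer division in Rust and C truncates toward zero, so a line-by-line port of `a // b` changes behavior for mixed-sign inputs. The sketch below is a hypothetical differential test, with `ported_div` standing in for the translated code; it is not taken from claw-code's test suite.

```python
import random


def reference_div(a: int, b: int) -> int:
    """Reference behavior: Python's // floors toward negative infinity."""
    return a // b


def ported_div(a: int, b: int) -> int:
    """Stand-in for a port that truncates toward zero, as integer
    division does in Rust and C."""
    quotient = abs(a) // abs(b)
    return quotient if (a >= 0) == (b > 0) else -quotient


NONZERO_DIVISORS = [n for n in range(-9, 10) if n != 0]


def find_mismatches(trials: int, seed: int) -> list[tuple[int, int]]:
    """Feed both versions the same random inputs and collect disagreements."""
    rng = random.Random(seed)
    mismatches = []
    for _ in range(trials):
        a = rng.randint(-50, 50)
        b = rng.choice(NONZERO_DIVISORS)
        if reference_div(a, b) != ported_div(a, b):
            mismatches.append((a, b))
    return mismatches


mismatches = find_mismatches(trials=1000, seed=0)
print(len(mismatches), "disagreements")
print(reference_div(-7, 2), ported_div(-7, 2))
```

Every disagreement involves operands of opposite sign that do not divide evenly, which is exactly the kind of invariant a syntax-level translation has no reason to notice and a parity test surfaces immediately.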

Claw Code’s own README acknowledges this friction explicitly. The PARITY.md file tracks gaps between the Rust port and the original implementation. The claw doctor command exists as a first health check after building. The project doesn’t pretend completeness; it builds in honest accounting of what hasn’t caught up yet.

Current State: Build-From-Source, Handle With Care

As of the repository’s current state, claw-code is build-from-source only. The cargo install claw-code command installs a deprecated stub that prints a rename notice and exits. The actual binary comes from building the Rust workspace directly or from installing the upstream agent-code crate. That crate installs a binary named agent, not agent-code, a naming mismatch that has already tripped up early adopters.

The project also carries rough edges typical of rapid community forks. ACP/Zed daemon support is tracked in the roadmap but not yet implemented; for now, claw acp serve returns status with exit code 0 and is described as a “discoverability alias only.” Windows support exists but requires PowerShell-specific path handling. The documentation is extensive, almost to the point of defensive over-communication, reflecting the project’s need to establish trust and clarity in a legally ambiguous space.

What This Means for the Agent Ecosystem

Claw Code’s existence, regardless of its legal future, has already shifted the agentic coding landscape. It demonstrates that the core architecture of a sophisticated proprietary tool can be reconstructed in the open within hours, not months. It validates that the “harness” pattern—minimal orchestration, rich tool system, model-driven reasoning—is replicable. And it raises questions about what constitutes competitive moat in AI tooling when the architecture, if not the implementation, can be inferred and rebuilt.

The project’s 48,000+ GitHub stars and 56,000 forks suggest demand for an open, inspectable alternative to vendor-controlled agent tools. Whether that demand lasts depends on two things. The first is whether the community can maintain parity with a moving target: Anthropic continues shipping Claude Code updates, and the leaked feature flags suggest capabilities not yet public. The second is whether the legal uncertainty resolves in a way that permits continued development.

For now, Claw Code stands as a case study in several overlapping phenomena: the speed of open-source reconstruction after proprietary exposure, the feasibility of AI-assisted language migration, the architectural convergence of agentic coding systems around tool-rich, permission-gated harnesses, and the unresolved tension between clean-room reimplementation and intellectual property law in the age of large language models.