antirez/ds4

A single-model inference engine that treats your SSD as KV cache

DwarfStar bets that DeepSeek V4 Flash deserves its own self-contained C engine, complete with disk-persistent KV cache and asymmetrical 2-bit quantization, instead of yet another generic GGUF runner.

★ 19k stars | C | Inference · Serving | Language Models | Coding Assistants

Velocity (7d): +65 ★ / day · Trend: ↘ cooling

What it does

DwarfStar is a native inference engine written in C that targets DeepSeek V4 Flash and PRO with no external runtime dependencies. It is intentionally not a generic GGUF loader; it only runs specific, project-provided quantized weights. The engine handles model loading, prompt rendering, tool calling, KV cache management in both RAM and on disk, an HTTP server API, and an integrated coding agent.

The interesting bit

The project’s central bet is that modern SSDs are fast enough to make the KV cache a “first-class disk citizen,” letting the engine hold enormous context windows (up to one million tokens) on machines with limited RAM. To make this practical on such machines, it relies on aggressively asymmetrical 2-bit quantization that compresses only the routed MoE experts while leaving shared experts and projections untouched, allowing a 284B-parameter model to run on 96GB–128GB MacBooks.
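
To make the on-disk KV cache idea concrete, here is a minimal sketch of a file-backed cache built on POSIX mmap. It is not taken from ds4: the kv_disk_cache struct, the kv_open/kv_slot/kv_close helpers, and the fixed-size per-token slot layout are all hypothetical, chosen only to show how cache contents can live in a file and survive a process restart.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical layout (not ds4's): one fixed-size slot of K and V data per token. */
typedef struct {
    int      fd;
    size_t   slot_bytes;  /* bytes of K+V data per token (assumed fixed) */
    size_t   max_tokens;  /* capacity of the backing file, in tokens */
    uint8_t *base;        /* start of the memory-mapped file */
} kv_disk_cache;

/* Open or create the backing file and map it read/write.
 * Returns 0 on success, -1 on error. Re-opening an existing file
 * of the same size keeps its previous contents. */
int kv_open(kv_disk_cache *c, const char *path,
            size_t slot_bytes, size_t max_tokens) {
    assert(slot_bytes > 0 && max_tokens > 0);
    size_t len = slot_bytes * max_tokens;

    c->fd = open(path, O_RDWR | O_CREAT, 0600);
    if (c->fd < 0) return -1;

    /* Size the file; newly added bytes read back as zero. */
    if (ftruncate(c->fd, (off_t)len) != 0) { close(c->fd); return -1; }

    /* MAP_SHARED makes stores to the mapping reach the file. */
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, c->fd, 0);
    if (p == MAP_FAILED) { close(c->fd); return -1; }

    c->base = (uint8_t *)p;
    c->slot_bytes = slot_bytes;
    c->max_tokens = max_tokens;
    return 0;
}

/* Pointer to the slot for a token position, or NULL if out of range. */
uint8_t *kv_slot(kv_disk_cache *c, size_t token) {
    return token < c->max_tokens ? c->base + token * c->slot_bytes : NULL;
}

/* Flush dirty pages to disk, then unmap and close. Returns 0 on success. */
int kv_close(kv_disk_cache *c) {
    size_t len = c->slot_bytes * c->max_tokens;
    int rc = msync(c->base, len, MS_SYNC);
    if (munmap(c->base, len) != 0) rc = -1;
    if (close(c->fd) != 0) rc = -1;
    return rc;
}
```

A caller would kv_open a file, write each token's keys and values into kv_slot(&cache, pos), and kv_close at shutdown; a later kv_open with the same sizes sees the same bytes. With a mapping like this the operating system pages slots in from the SSD on demand, which is one plausible way a context far larger than RAM could stay addressable. How ds4 actually lays out and manages its cache may differ.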

Key highlights

Deliberately narrow scope: one model family, validated against official logits, with custom GGUFs and offline tooling rather than generic compatibility.
On-disk KV cache persistence designed for long-context inference on high-end personal machines.
Asymmetrical 2-bit quantization (IQ2_XXS for routed up/gate, Q2_K for routed down) is applied only to the routed experts, so shared experts and projections are left untouched.
Supports tool calling, an HTTP API, and an integrated coding agent (ds4-agent, currently alpha).
Openly built with heavy assistance from GPT 5.5, with humans steering architecture and debugging.
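
The quantization highlight above invites a quick back-of-envelope check on memory. The sketch below assumes that ds4's custom GGUFs use the same block layouts as the ggml formats of the same names (IQ2_XXS: 66 bytes per 256 weights, Q2_K: 84 bytes per 256 weights) and that the routed up, gate, and down matrices are equally sized. The routed_frac and rest_bpw parameters are hypothetical placeholders, not figures published by the project.

```c
#include <assert.h>

/* Bits per weight, assuming the ggml block layouts of the same names. */
#define BPW_IQ2_XXS (66.0 * 8.0 / 256.0)  /* 2.0625 bits per weight */
#define BPW_Q2_K    (84.0 * 8.0 / 256.0)  /* 2.625 bits per weight  */

/* Average bits per weight across routed experts, assuming up, gate,
 * and down are equally sized: two parts IQ2_XXS, one part Q2_K. */
double routed_avg_bpw(void) {
    return (2.0 * BPW_IQ2_XXS + BPW_Q2_K) / 3.0;
}

/* Size in gigabytes (1e9 bytes) of n_params weights at a given bit width. */
double weights_gb(double n_params, double bits_per_weight) {
    return n_params * bits_per_weight / 8.0 / 1e9;
}

/* Size of a mixed-precision model.
 * routed_frac: fraction of parameters in routed experts (hypothetical input).
 * rest_bpw:    bits per weight for everything else (hypothetical input). */
double mixed_model_gb(double total_params, double routed_frac, double rest_bpw) {
    assert(routed_frac >= 0.0 && routed_frac <= 1.0);
    double routed = weights_gb(total_params * routed_frac, routed_avg_bpw());
    double rest   = weights_gb(total_params * (1.0 - routed_frac), rest_bpw);
    return routed + rest;
}
```

Under these assumptions the routed average works out to 2.25 bits per weight, and weights_gb(284e9, 2.25) comes to roughly 80 GB. That figure is a floor, since the untouched shared components take more bits per weight, but it is at least consistent with the 96GB–128GB machines named above. Any specific split passed to mixed_model_gb should be read as illustrative rather than measured.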

Caveats

The code is explicitly beta quality and only a few days old; the ds4-agent is alpha.
The CPU path on macOS currently crashes the kernel due to an Apple virtual-memory bug, so Metal is effectively required on Apple Silicon.
PRO support is experimental and realistically limited to 512GB Mac Studio-class hardware; the ROCm backend lives in a community-rebased branch because the author lacks AMD hardware.

Verdict

Developers with 96GB+ MacBooks or DGX Spark-class Linux boxes who want a finished, end-to-end DeepSeek V4 stack should look here; if you need a generic multi-model runner or have modest RAM, this is not your engine.

Frequently asked

What is antirez/ds4?: DwarfStar bets that DeepSeek V4 Flash deserves its own self-contained C engine, complete with disk-persistent KV cache and asymmetrical 2-bit quantization, instead of yet another generic GGUF runner.
Is ds4 open source?: Yes — antirez/ds4 is open source, released under the MIT license.
What language is ds4 written in?: antirez/ds4 is primarily written in C.
How popular is ds4?: antirez/ds4 has 19k stars on GitHub. Its star growth is currently slowing, at about 65 new stars per day over the past 7 days.
Where can I find ds4?: antirez/ds4 is on GitHub at https://github.com/antirez/ds4.