Is spark-vllm-docker open source?

Yes — eugr/spark-vllm-docker is open source, released under the MIT license.

What language is spark-vllm-docker written in?

eugr/spark-vllm-docker is primarily written in Shell.

How popular is spark-vllm-docker?

eugr/spark-vllm-docker has 1.9k stars on GitHub and is currently holding steady.

Where can I find spark-vllm-docker?

eugr/spark-vllm-docker is on GitHub at https://github.com/eugr/spark-vllm-docker.


eugr/spark-vllm-docker

Docker scripts that stop DGX Spark clusters from choking on vLLM

This repo exists because running vLLM across multiple DGX Sparks requires more than a base image—it needs NCCL workarounds, memory hacks, and model-specific recipes that upstream doesn't ship.

★ 1.9k stars · Shell · Inference · Serving · LLMOps · Eval


Velocity (7d): +5.0 ★ / day · Trend: steady

What it does

Provides Docker images and shell scripts to deploy vLLM on NVIDIA DGX Spark systems, scaling from a single node up to multi-node clusters connected via InfiniBand or RoCE. It handles the messy parts (NCCL configuration, distributed startup via Ray or native PyTorch, and model distribution across nodes) so you don't have to wire up passwordless SSH and RDMA by hand every time. The build process pulls precompiled vLLM and FlashInfer wheels by default, skipping lengthy source compilation unless you explicitly ask for it.

The interesting bit

Rather than just wrapping vLLM in a container, the project ships a growing collection of “mods”: runtime patches that fix specific landmines like the NCCL load-order bug that hangs multi-node Spark clusters, or a drop-caches mod that prevents fastsafetensors from freezing when loading massive models near RAM limits. There's also a recipe system that encodes per-model launch flags for beasts like Qwen3.5-397B and StepFun Step 3.7 Flash, effectively turning finicky distributed inference into a one-liner.
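To make the recipe idea concrete, here is a minimal Python sketch of a table that maps a model to its cluster-size requirement and extra launch flags. It is not the repo's actual recipe format: the `Recipe` class, the `RECIPES` keys, and `launch_flags` are hypothetical names invented for illustration. The only details taken from this page are that Step 3.7 Flash needs at least two Sparks and that its FP8 variant requires `--no-ray`; treating the FP8 variant as also needing two nodes is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recipe:
    """Hypothetical per-model launch recipe (not the repo's real schema)."""
    min_nodes: int = 1
    extra_flags: tuple[str, ...] = ()


# Illustrative entries only. The keys are made up; the constraints mirror
# the caveats listed on this page.
RECIPES = {
    "step-3.7-flash": Recipe(min_nodes=2),
    # Assumption: the FP8 variant inherits the two-node minimum.
    "step-3.7-flash-fp8": Recipe(min_nodes=2, extra_flags=("--no-ray",)),
}


def launch_flags(model: str, nodes: int) -> list[str]:
    """Return the extra flags for `model`, or raise if the cluster is too small."""
    recipe = RECIPES[model]
    if nodes < recipe.min_nodes:
        raise ValueError(
            f"{model} needs at least {recipe.min_nodes} nodes, got {nodes}"
        )
    return list(recipe.extra_flags)
```

Encoding the constraint next to the flags means an undersized cluster fails fast with a readable message instead of partway through model loading.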

Key highlights

Supports solo, dual, and 3-node mesh topologies with autodiscovery for model distribution
Prebuilt nightly wheels for vLLM and FlashInfer cut initial build time to roughly 2–3 minutes after the base image pull
Includes workarounds for known DGX Spark issues, including an NCCL soname redirect and memory-pressure loading formats (fastsafetensors, instanttensor)
Can patch and launch official vllm-openai containers via the use-official-vllm mod, even though they lack git
Active maintenance with model-specific recipes for MiniMax-M2, Gemma4, Qwen3.5/3.6, and Step 3.7 Flash

Caveats

The README warns that --load-format fastsafetensors can OOM if the model weights alone, before the KV cache is counted, consume more than ~85% of available RAM
Some recipes require specific cluster sizes or flags (e.g., Step 3.7 Flash needs at least two Sparks; its FP8 variant requires --no-ray to fit)
Because vLLM moves fast, the maintainers note that “some things may break” despite nightly testing
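The ~85% rule of thumb from the first caveat can be turned into a quick pre-flight check. The sketch below is a hypothetical helper, not part of the repo: the function name and the default `limit` are assumptions, and it compares only the weight size against total RAM, leaving the KV cache out of the calculation.

```python
def fastsafetensors_is_risky(
    model_bytes: int, ram_bytes: int, limit: float = 0.85
) -> bool:
    """Return True when the weights alone exceed `limit` of available RAM.

    Hypothetical helper: `limit` defaults to the ~85% figure quoted from
    the README; the KV cache is deliberately not included.
    """
    if ram_bytes <= 0:
        raise ValueError("ram_bytes must be positive")
    return model_bytes > limit * ram_bytes


GIB = 2**30
# Example values only: a node with 128 GiB of RAM.
print(fastsafetensors_is_risky(110 * GIB, 128 * GIB))  # weights at ~86% of RAM
print(fastsafetensors_is_risky(100 * GIB, 128 * GIB))  # weights at ~78% of RAM
```

When the check returns True, a different load format or a larger cluster is the safer choice.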

Verdict

Worth bookmarking if you actually own DGX Spark hardware and want to run large models across nodes without becoming a distributed systems janitor. Everyone else, especially those without InfiniBand cables and a spare Spark, can safely keep scrolling.
