Is sglang open source?

Yes — sgl-project/sglang is open source, released under the Apache-2.0 license.

What language is sglang written in?

sgl-project/sglang is primarily written in Python.

How popular is sglang?

sgl-project/sglang has 30.6k stars on GitHub and is currently accelerating.

Where can I find sglang?

sgl-project/sglang is on GitHub at https://github.com/sgl-project/sglang.

← all repositories

sgl-project/sglang

The inference stack that doubles as an RL backbone

SGLang exists to push low-latency, high-throughput inference for LLMs and multimodal models from a single GPU up to massive clusters.

★30.6k stars Python Inference · Serving Language Models Image · Video · Audio

View on GitHub ↗ Homepage ↗

Velocity · 7d

+42

★ / day

Trend

↗accelerating

star history

What it does SGLang is an open-source serving framework for large language and multimodal models. It handles inference across a wide hardware footprint—NVIDIA and AMD GPUs, Intel CPUs, Google TPUs, and Ascend NPUs—scaling from a single card up to large distributed clusters. The project claims deployments on over 400,000 GPUs worldwide and trillions of tokens served daily in production.

The interesting bit What started as a faster inference runtime has grown into a rollout backbone for reinforcement learning and post-training pipelines, with integrations into frameworks like verl, AReaL, and Google’s Tunix. It also stretches beyond text into diffusion models such as WAN and Qwen-Image, making it a general-purpose generative-model serving layer rather than just another LLM engine.

Key highlights

RadixAttention for prefix caching, speculative decoding, and prefill-decode disaggregation are core runtime optimizations.
Supports a wide model zoo: Llama, Qwen, DeepSeek, Mistral, GLM, GPT, Gemma, plus embedding, reward, and diffusion models.
Hardware coverage spans NVIDIA GB200/B300/H100 down to consumer 5090s, AMD MI300/MI355, Intel Xeon, Google TPUs, and Ascend NPUs.
Claims to power over 400,000 GPUs in production at organizations including xAI, LinkedIn, Cursor, and various cloud providers.
Acknowledges borrowing designs and code from vLLM, FlashInfer, Outlines, Guidance, LightLLM, and LMQL.

Verdict If you operate inference at scale—or build RL post-training pipelines that need a fast rollout backend—SGLang is worth evaluating. Hobbyists running occasional prompts on a single GPU probably don’t need this level of machinery.

Frequently asked

What is sgl-project/sglang?: SGLang exists to push low-latency, high-throughput inference for LLMs and multimodal models from a single GPU up to massive clusters.
Is sglang open source?: Yes — sgl-project/sglang is open source, released under the Apache-2.0 license.
What language is sglang written in?: sgl-project/sglang is primarily written in Python.
How popular is sglang?: sgl-project/sglang has 30.6k stars on GitHub and is currently accelerating.
Where can I find sglang?: sgl-project/sglang is on GitHub at https://github.com/sgl-project/sglang.