Is ms-swift open source?

Yes — modelscope/ms-swift is open source, released under the Apache-2.0 license.

What language is ms-swift written in?

modelscope/ms-swift is primarily written in Python.

How popular is ms-swift?

modelscope/ms-swift has 14.9k stars on GitHub, and its star growth is currently steady.

Where can I find ms-swift?

modelscope/ms-swift is on GitHub at https://github.com/modelscope/ms-swift.

modelscope/ms-swift

Fine-Tuning 600+ LLMs Without Swapping Tools Every Week

ms-swift exists because juggling separate toolchains for LoRA experiments, Megatron-scale pre-training, and multimodal RLHF is a waste of engineering hours.

★ 14.9k stars · Python · ML Frameworks · LLMOps · Eval

Velocity (7d): +11 ★/day · Trend: steady

What it does

ms-swift is a Python framework that covers the full lifecycle of large-model customization (training, inference, evaluation, quantization, and deployment) in a single toolchain. It supports lightweight PEFT methods such as LoRA and QLoRA, full-parameter pre-training and fine-tuning, and a long list of alignment algorithms, from DPO to the GRPO family. The framework claims day-zero support for over 600 text-only models and 400 multimodal models, including recent releases like Qwen3, Llama4, and DeepSeek-R1.
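To make the appeal of LoRA-style PEFT concrete, here is a minimal sketch of the parameter arithmetic behind a low-rank adapter on a single linear layer. It is plain Python, not ms-swift code, and the layer size and rank are hypothetical values chosen only for illustration.

```python
# Illustrative arithmetic only; this does not call ms-swift.

def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the two low-rank factors A (rank x d_in) and B (d_out x rank)."""
    return rank * d_in + d_out * rank


def full_params(d_in: int, d_out: int) -> int:
    """Parameters in the dense weight matrix W (d_out x d_in), bias ignored."""
    return d_in * d_out


# Hypothetical layer size and rank (assumptions, not taken from the README).
d_in, d_out, rank = 4096, 4096, 8

lora = lora_trainable_params(d_in, d_out, rank)
full = full_params(d_in, d_out)
print(f"LoRA: {lora:,} trainable vs. full: {full:,} ({lora / full:.2%})")
```

Under these assumptions the adapter trains well under 1% of the layer's weights, which is why a quick LoRA test fits on hardware where a full-parameter run would not.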

The interesting bit

The real value is in the boring infrastructure: it abstracts the distributed strategies (DeepSpeed, FSDP, and Megatron parallelism with TP, PP, EP, and SP), so you can move from a single-GPU LoRA test to a multi-node MoE full-parameter run without switching codebases. It also bundles sequence parallelism (Ulysses and Ring-Attention), multimodal packing with a claimed 100%+ training speedup, and inference backends like vLLM and SGLang under one roof.
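As a rough illustration of what "Megatron parallelism" means for a cluster layout, the sketch below computes how many data-parallel replicas remain once tensor-parallel (TP) and pipeline-parallel (PP) degrees are fixed. It assumes the common Megatron-style relation world size = TP × PP × DP, and deliberately leaves out EP and SP, whose composition with the other dimensions depends on the framework. The function name is hypothetical and is not an ms-swift API.

```python
def data_parallel_size(world_size: int, tp: int, pp: int) -> int:
    """Return the number of data-parallel replicas for a given GPU count.

    Assumes world_size = tp * pp * dp (EP and SP are not modeled here).
    """
    model_parallel = tp * pp  # GPUs needed to hold one model replica
    if world_size % model_parallel != 0:
        raise ValueError(
            f"world_size={world_size} is not divisible by tp*pp={model_parallel}"
        )
    return world_size // model_parallel


# Hypothetical cluster: 2 nodes x 8 GPUs, TP=2, PP=4.
print(data_parallel_size(world_size=16, tp=2, pp=4))  # 2 replicas
```

A single-GPU LoRA test is just the degenerate case `data_parallel_size(1, 1, 1)`, which is the sense in which one codebase can span both ends of the scale.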

Key highlights

Supports 600+ text LLMs and 400+ multimodal LLMs, with built-in templates for 150+ datasets.
Integrates Megatron parallelism specifically to accelerate MoE model training, plus full GRPO-family RL algorithms (DAPO, GSPO, SAPO, CHORD, etc.).
Claims that a 7B model can be trained with as little as 9 GB of VRAM using quantized training (BNB, AWQ, and GPTQ are supported).
Provides a Web UI and OpenAI-compatible inference endpoints, covering the full pipeline from training to deployment.
Paper accepted at AAAI 2025.
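For a sense of scale on the 9GB figure above, the following back-of-the-envelope sketch estimates the memory taken by the weights of a 7B-parameter model at different precisions. It counts weights only; activations, adapter gradients, optimizer state, and framework overhead come on top, and the exact quantization settings behind the README's number are not stated here, so treat this as a plausibility check rather than a reproduction.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# 7e9 parameters at 16-bit, 8-bit, and 4-bit precision.
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {weight_memory_gb(7e9, bits):.1f} GB")
```

At 16 bits the weights alone already exceed 9 GB, while at 4 bits they take roughly 3.5 GB, which leaves headroom for the rest of the training state.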

Caveats

The README lists a v4.0 release dated March 2026, which is almost certainly a typo, but it hints at occasional doc drift amid rapid iteration.
“Significant speedup” claims for Megatron LoRA on MoE models lack specific numbers in the README, so you’ll need to benchmark your own architecture.
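Since the caveat above leaves benchmarking to you, here is a minimal timing harness sketch for comparing training configurations by throughput. The `measure_throughput` helper and the `fake_step` stand-in are hypothetical names; in practice `step_fn` would wrap one real training step of whatever setup you are comparing.

```python
import time
from typing import Callable


def measure_throughput(step_fn: Callable[[], None], tokens_per_step: int,
                       warmup: int = 3, iters: int = 10) -> float:
    """Return average tokens per second over `iters` timed calls to `step_fn`."""
    for _ in range(warmup):
        step_fn()  # untimed warmup calls (caches, compilation, allocator)
    start = time.perf_counter()
    for _ in range(iters):
        step_fn()
    elapsed = time.perf_counter() - start
    return tokens_per_step * iters / elapsed


def fake_step() -> None:
    """Stand-in for a real training step; sleeps instead of computing."""
    time.sleep(0.01)


print(f"{measure_throughput(fake_step, tokens_per_step=8192):,.0f} tokens/s")
```

On a GPU, the step function should block until the device has finished (for example by calling `torch.cuda.synchronize()` at the end of the step), otherwise the timer measures kernel launch time rather than actual work.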

Verdict

ms-swift is for teams that want one framework to handle everything from quick LoRA experiments to full-parameter GRPO runs across text and multimodal models. If you only ever fine-tune one small model on a single GPU, it is probably overkill.
