intel/ipex-llm

Intel’s LLM accelerator for Arc, NPU, and CPU is now a museum piece

It let developers run and fine-tune 70+ LLMs on Intel graphics and NPUs without an Nvidia card, then Intel pulled the plug.

★8.9k stars Python Inference · Serving ML Frameworks

What it does

ipex-llm was a PyTorch library that optimized local inference and fine-tuning for large language models on Intel XPUs: integrated GPUs, discrete Arc and Max cards, Core Ultra NPUs, and CPUs. It acted as a compatibility shim, hooking into HuggingFace Transformers, llama.cpp, Ollama, vLLM, LangChain, and others while compressing models down to FP4 or INT4 to fit limited memory.

The interesting bit

Late in its life, the project shipped FlashMoE, which claimed to run 671B-parameter MoE models like DeepSeek V3/R1 on just one or two consumer Arc GPUs. It also offered pre-packaged “portable zip” builds of Ollama and llama.cpp specifically tuned for Intel silicon, essentially curating open-source tools for its hardware stack.
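
A quick back-of-the-envelope calculation puts the 671B figure, and the appeal of low-bit formats, in perspective. The sketch below estimates the size of the weights alone at several bit widths. The function name `weight_footprint_gb` is made up for illustration, and the estimate deliberately ignores quantization scales, the KV cache, and activations, so real requirements would be higher.

```python
def weight_footprint_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in decimal gigabytes (1 GB = 1e9 bytes).

    Illustrative helper only: it ignores per-block scales, KV cache, and activations.
    """
    return num_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    # 671e9 parameters is the model size quoted in the text above.
    for bits in (16, 8, 4, 2):
        print(f"{bits:>2}-bit weights: {weight_footprint_gb(671e9, bits):8.1f} GB")
```

Under these assumptions, a 671B-parameter model needs about 1,342 GB at 16 bits and about 335.5 GB at 4 bits. Even the 4-bit figure is far larger than the memory on a consumer graphics card, so the FlashMoE claim presumably relied on more than quantization alone; this summary does not say how.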

Key highlights

Verified support for 70+ models, including Llama, Mistral, DeepSeek, Qwen, Phi, and Gemma3.
Low-bit quantization support: FP8, FP6, FP4, INT4, and INT2 via llama.cpp IQ2.
Broad integration with HuggingFace, llama.cpp, Ollama, vLLM, LangChain, LlamaIndex, DeepSpeed, FastChat, and Axolotl.
FlashMoE for running massive MoE models on 1–2 Intel Arc GPUs.
NPU support for Intel Core Ultra series and pipeline-parallel inference across multiple Arc cards.
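
For readers unfamiliar with the low-bit formats listed above, the following is a minimal sketch of symmetric 4-bit integer quantization applied to one block of weights. It is a generic textbook scheme, not ipex-llm's actual kernels or storage format, and the function names `quantize_int4` and `dequantize_int4` are hypothetical.

```python
from typing import List, Tuple


def quantize_int4(block: List[float]) -> Tuple[List[int], float]:
    """Map one block of floats to integers in [-8, 7] plus a single shared scale."""
    max_abs = max((abs(x) for x in block), default=0.0)
    if max_abs == 0.0:
        # An all-zero (or empty) block needs no scaling; 1.0 is an arbitrary choice.
        return [0] * len(block), 1.0
    scale = max_abs / 7.0  # the largest magnitude maps to +/-7
    quantized = [max(-8, min(7, round(x / scale))) for x in block]
    return quantized, scale


def dequantize_int4(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate floats from the 4-bit integers and the block scale."""
    return [v * scale for v in quantized]


if __name__ == "__main__":
    weights = [0.12, -0.7, 0.33, 0.05]
    q, s = quantize_int4(weights)
    print(q, s)
    print(dequantize_int4(q, s))
```

Each weight is stored in 4 bits plus a share of the per-block scale, and the round trip loses at most half a quantization step per value. Practical formats differ in block size, zero points, and packing, none of which this sketch models.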

Caveats

Intel has archived the project, ceased all development, and publicly identified known security issues; patches are no longer accepted.
The llama.cpp Portable Zip warns that mmap-based model loading may leak data via side-channels in shared or multi-tenant environments.
Future compatibility with new PyTorch releases, models, or OS updates is uncertain at best.

Verdict

A viable reference if you are maintaining legacy Intel Arc or NPU deployments, but steer clear for new projects. If your hardware says “RTX,” this was never meant for you anyway.

Frequently asked

What is intel/ipex-llm?: It let developers run and fine-tune 70+ LLMs on Intel graphics and NPUs without an Nvidia card, then Intel pulled the plug.
Is ipex-llm open source?: Yes — intel/ipex-llm is open source, released under the Apache-2.0 license.
What language is ipex-llm written in?: intel/ipex-llm is primarily written in Python.
How popular is ipex-llm?: intel/ipex-llm has 8.9k stars on GitHub.
Where can I find ipex-llm?: intel/ipex-llm is on GitHub at https://github.com/intel/ipex-llm.