intel/ipex-llm

Intel LLM acceleration library for GPU, NPU, and CPU enabling optimized inference and finetuning of 70+ LLMs.

Velocity (7d): +2.5 ★/day. Trend: steady.

IPEX-LLM provides hardware-accelerated LLM inference and finetuning on Intel XPU hardware, including integrated GPUs, discrete GPUs (Arc, Flex, Max), and NPUs. It supports low-bit quantization (FP8/FP6/FP4/INT4) alongside other LLM optimizations. The library integrates with popular ecosystem tools, including llama.cpp, Ollama, vLLM, HuggingFace transformers, LangChain, LlamaIndex, DeepSpeed, and Axolotl, and lists over 70 verified models such as LLaMA, Mistral, DeepSeek, Qwen, and Phi.
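To make the low-bit quantization mentioned above more concrete, here is a minimal sketch of generic symmetric INT4 quantization in plain Python. It is an illustration of the general idea only, not IPEX-LLM's implementation: the function names `quantize_int4` and `dequantize_int4` are hypothetical, and real libraries typically quantize per block of weights and pack two 4-bit codes into each byte.

```python
# Illustrative sketch only: a generic symmetric INT4 quantizer.
# This is NOT IPEX-LLM's code; the function names are hypothetical.
from typing import List, Tuple


def quantize_int4(weights: List[float]) -> Tuple[List[int], float]:
    """Map floats to integer codes in [-8, 7] using one shared scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Divide by 7 so that both +max_abs and -max_abs fit in the INT4 range.
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    # Round to the nearest code, then clamp to the signed 4-bit range.
    codes = [max(-8, min(7, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_int4(codes: List[int], scale: float) -> List[float]:
    """Recover approximate floats from INT4 codes."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    weights = [0.7, -0.35, 0.1, 0.0]
    codes, scale = quantize_int4(weights)
    restored = dequantize_int4(codes, scale)
    print("codes:", codes)
    print("scale:", scale)
    print("restored:", restored)
```

Each weight is stored as a 4-bit code plus a shared scale, so the reconstruction error per weight is bounded by about half the scale. That trade of precision for memory is what makes low-bit formats attractive for inference on memory-constrained hardware.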
