SearchSavior/OpenArc
An inference engine for Intel devices that serves LLMs, VLMs, Whisper, TTS, and embedding models via OpenAI-compatible API endpoints.

Velocity · 7d
+0.9
★ / day
Trend
→steady
star history
OpenArc is an inference engine for Intel hardware (CPU/GPU/NPU) that serves a variety of AI models including LLMs, VLMs, Whisper, Kokoro-TTS, Qwen-TTS, Qwen-ASR, and embedding/reranker models over OpenAI-compatible endpoints. It is powered by OpenVINO and supports features like speculative decoding, multi-GPU pipeline parallelism, CPU offload, and hybrid device deployment. The project targets local, private AI inference.