janhq/cortex.cpp
A C++-based local AI API platform that serves language, vision, and speech models using llama.cpp and ONNX runtimes.

Velocity · 7d
+2.8
★ / day
Trend
→steady
star history
Cortex.cpp is an open-source local inference server that provides a REST API for running AI models on consumer hardware. It wraps llama.cpp and ONNXRuntime to support GGUF quantized models and ONNX-format neural networks across vision, speech, and language modalities. The project provides platform installers for Windows, macOS, and Linux to simplify deployment of local AI inference.