madroidmaq/mlx-omni-server
A local inference server for Apple Silicon using MLX, exposing OpenAI and Anthropic API-compatible endpoints for running LLMs, audio, and image generation models.

Velocity · 7d
+1.2
★ / day
Trend
→steady
star history
MLX Omni Server is a model inference serving system designed for Apple Silicon (M1-M4 chips). It leverages Apple’s MLX framework for hardware-accelerated local inference and implements OpenAI and Anthropic API-compatible endpoints, allowing drop-in replacement with existing SDKs. The server supports a complete AI suite including chat, speech-to-speech, text-to-speech, image generation, and embeddings.