← all repositories

madroidmaq/mlx-omni-server

A local inference server for Apple Silicon using MLX, exposing OpenAI and Anthropic API-compatible endpoints for running LLMs, audio, and image generation models.

mlx-omni-server
Velocity · 7d
+1.2
★ / day
Trend
steady
star history

MLX Omni Server is a model inference serving system designed for Apple Silicon (M1-M4 chips). It leverages Apple’s MLX framework for hardware-accelerated local inference and implements OpenAI and Anthropic API-compatible endpoints, allowing drop-in replacement with existing SDKs. The server supports a complete AI suite including chat, speech-to-speech, text-to-speech, image generation, and embeddings.

heatdrop uses Google Analytics to see which pages get read — nothing else. Your call. How we handle data.