evilsocket/cake
A distributed multimodal AI inference server written in Rust supporting text, image, and voice synthesis models across heterogeneous clusters.

Cake is a Rust-based inference server that runs AI models either on a single node or sharded across a heterogeneous cluster of devices spanning iOS, Android, macOS, Linux, and Windows. It supports multiple modality types including text generation, image generation via Stable Diffusion and FLUX, and voice synthesis with VibeVoice TTS including voice cloning. The system auto-detects model architectures from HuggingFace checkpoints and provides an OpenAI-compatible API.