BoltzmannEntropy/MimikaStudio
A local-first macOS app that clones voices in seconds, generates speech from text, reads documents aloud, and creates audiobooks using multiple ML voice synthesis engines.

MimikaStudio is a voice cloning and text-to-speech application that runs entirely on-device, using Metal acceleration on Apple Silicon via MLX. It clones voices from a 3-second reference clip using the Qwen3-TTS, Chatterbox, and RVC engines, and generates high-quality TTS using the Kokoro and Supertronic models. The app also reads PDF, DOCX, and EPUB documents aloud with synchronized highlighting, and converts them to audiobooks with queueable chapter generation. It operates as an agentic voice cloning server: a jobs queue drives the TTS, cloning, and audiobook pipelines, and the same functionality is exposed through both a UI and an API.
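
To make the jobs-queue idea concrete, here is a minimal Python sketch of how TTS, cloning, and audiobook requests could share one queue. It is an illustration only, not MimikaStudio's actual code or API: `JobKind`, `Job`, `JobQueue`, and the handler functions are hypothetical names, and a real handler would call a synthesis engine instead of returning a string.

```python
import itertools
from dataclasses import dataclass
from enum import Enum
from queue import Queue
from typing import Callable, Dict, Optional


class JobKind(Enum):
    """The three pipelines named in the description (hypothetical enum)."""
    TTS = "tts"
    CLONE = "clone"
    AUDIOBOOK = "audiobook"


@dataclass
class Job:
    job_id: int
    kind: JobKind
    payload: dict
    status: str = "queued"          # queued -> running -> done | failed
    result: Optional[str] = None


class JobQueue:
    """FIFO queue that dispatches each job to the handler for its kind."""

    def __init__(self, handlers: Dict[JobKind, Callable[[dict], str]]):
        self._handlers = handlers
        self._pending: Queue = Queue()
        self._jobs: Dict[int, Job] = {}
        self._ids = itertools.count(1)

    def submit(self, kind: JobKind, payload: dict) -> int:
        """Register a job and return its id so a caller can poll it later."""
        job = Job(next(self._ids), kind, payload)
        self._jobs[job.job_id] = job
        self._pending.put(job)
        return job.job_id

    def get(self, job_id: int) -> Job:
        return self._jobs[job_id]

    def run_pending(self) -> None:
        """Process queued jobs in submission order, recording failures."""
        while not self._pending.empty():
            job = self._pending.get()
            job.status = "running"
            try:
                job.result = self._handlers[job.kind](job.payload)
                job.status = "done"
            except Exception as exc:  # includes a missing handler (KeyError)
                job.result = repr(exc)
                job.status = "failed"


# Placeholder handlers: a real one would invoke a synthesis engine.
def fake_tts(payload: dict) -> str:
    return f"synthesized {len(payload['text'])} characters"


def fake_audiobook(payload: dict) -> str:
    return f"rendered {len(payload['chapters'])} chapters"


if __name__ == "__main__":
    jobs = JobQueue({JobKind.TTS: fake_tts, JobKind.AUDIOBOOK: fake_audiobook})
    first = jobs.submit(JobKind.TTS, {"text": "Hello there"})
    second = jobs.submit(JobKind.AUDIOBOOK, {"chapters": ["One", "Two"]})
    jobs.run_pending()
    print(jobs.get(first).status, jobs.get(first).result)
    print(jobs.get(second).status, jobs.get(second).result)
```

Returning a job id from `submit` and keeping status on the job record is what would let a UI and an API share the same backend: either front end can enqueue work and poll for completion without blocking on synthesis.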