remsky/Kokoro-FastAPI
A Dockerized FastAPI service that wraps the Kokoro-82M text-to-speech model with an OpenAI-compatible API endpoint.

This repository provides a production-ready API wrapper for the Kokoro-82M text-to-speech model, with multi-language TTS generation (English, Spanish, French, Hindi, Italian, Japanese, Portuguese, Mandarin). It offers an OpenAI-compatible speech endpoint, per-word timestamped captions, and voice mixing, in which several voices are blended according to per-voice weights. The project ships multi-platform Docker images for CPU inference, NVIDIA GPUs (CUDA), AMD GPUs (ROCm), and Apple Silicon (MPS).
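
As a rough illustration of how a client might talk to an OpenAI-compatible speech endpoint and describe a weighted voice mix, here is a minimal Python sketch using only the standard library. Several details are assumptions rather than facts from this description: the host and port in `BASE_URL`, the model name `"kokoro"`, the placeholder voice names, and in particular the `name(weight)+name(weight)` mix string produced by the hypothetical helper `format_voice_mix`. The request path and JSON fields follow the shape of OpenAI's speech API; check the project's own documentation for the values the server actually accepts.

```python
import json
import urllib.request

# Assumed values for illustration only: host/port, model name, and voice
# names are placeholders, not taken from the project description.
BASE_URL = "http://localhost:8880"
SPEECH_PATH = "/v1/audio/speech"  # path used by OpenAI's speech API


def normalize_weights(weights):
    """Scale positive per-voice weights so that they sum to 1.0."""
    if not weights:
        raise ValueError("at least one voice is required")
    if any(w <= 0 for w in weights.values()):
        raise ValueError("weights must be positive")
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def format_voice_mix(weights):
    """Render a mix as 'name(weight)+name(weight)'.

    The string syntax is an assumption made for this sketch; the server
    may expect a different format.
    """
    normalize_weights(weights)  # reuse the validation above
    return "+".join(f"{name}({w:g})" for name, w in weights.items())


def build_speech_request(text, voice, response_format="mp3", model="kokoro"):
    """Build (but do not send) a POST request in the OpenAI speech shape."""
    body = {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
    }
    return urllib.request.Request(
        BASE_URL + SPEECH_PATH,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, voice, out_path):
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(text, voice)
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as f:
        f.write(response.read())
```

With a server running, a call such as `synthesize("Hello world", format_voice_mix({"voice_a": 2, "voice_b": 1}), "hello.mp3")` would request a 2:1 blend of the two placeholder voices. Building the request separately from sending it keeps the payload easy to inspect without a live service.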