AutoArk/GPA

One 0.6B-parameter transformer that listens and talks back

GPA aims to unify speech recognition, text-to-speech, and voice conversion in one compact autoregressive model so you can stop juggling separate audio pipelines.

★ 1.6k stars · Python · Image · Video · Audio

Velocity (7d): +47 ★ / day · Trend: ↗ accelerating

What it does

GPA-v1.5 is a 0.6B-parameter autoregressive transformer that handles speech understanding and generation through a single audio-language model. It ships with native PyTorch and Hugging Face workflows for ASR and TTS, plus an ONNX runtime that runs locally through a CLI, FastAPI service, or browser UI. A slimmed-down GPA-TTS variant is also available for edge deployment with selectable INT8, FP16, or FP32 decoders.

The interesting bit

Rather than splitting ASR and TTS into separate encoder and diffusion stacks, GPA frames both as next-token prediction inside one transformer. The authors also extracted a standalone, quantized TTS runtime because, as they admit, “TTS is by far the most popular feature” in their demo.
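To make the "both tasks as next-token prediction" idea concrete, here is a toy sketch in plain Python. It is not GPA's code: the task tags, the separator, the audio token names, and the lookup-table "model" are all hypothetical stand-ins chosen for illustration. The only point is that one generation loop can serve ASR and TTS once audio and text are both represented as tokens in a single sequence.

```python
# Toy illustration only. Every token name, tag, and table below is
# hypothetical and is NOT taken from the GPA codebase.
from typing import Callable, List

TASK_ASR = "<asr>"  # hypothetical tag: audio tokens in, text tokens out
TASK_TTS = "<tts>"  # hypothetical tag: text tokens in, audio tokens out
SEP = "<sep>"       # hypothetical boundary between source and target
EOS = "<eos>"       # end-of-sequence marker

# Stand-in for learned behaviour: a fixed audio-token <-> word mapping.
AUDIO_TO_TEXT = {"<a17>": "hello", "<a42>": "world"}
TEXT_TO_AUDIO = {word: tok for tok, word in AUDIO_TO_TEXT.items()}


def build_prompt(task: str, source_tokens: List[str]) -> List[str]:
    """Prefix the source tokens with a task tag and close with a separator."""
    return [task, *source_tokens, SEP]


def toy_next_token(seq: List[str]) -> str:
    """Stand-in for a trained transformer: predicts one token from the prefix."""
    task = seq[0]
    sep = seq.index(SEP)
    source = seq[1:sep]
    produced = len(seq) - sep - 1  # target tokens emitted so far
    if produced >= len(source):
        return EOS
    table = AUDIO_TO_TEXT if task == TASK_ASR else TEXT_TO_AUDIO
    return table[source[produced]]


def generate(prompt: List[str],
             next_token: Callable[[List[str]], str],
             max_new_tokens: int = 32) -> List[str]:
    """Greedy autoregressive loop: append one predicted token until EOS."""
    seq = list(prompt)
    out: List[str] = []
    for _ in range(max_new_tokens):
        tok = next_token(seq)
        if tok == EOS:
            break
        seq.append(tok)
        out.append(tok)
    return out


# The same loop handles both directions; only the task tag changes.
transcript = generate(build_prompt(TASK_ASR, ["<a17>", "<a42>"]), toy_next_token)
speech = generate(build_prompt(TASK_TTS, ["hello", "world"]), toy_next_token)
```

In a real system the lookup table would be replaced by the transformer's forward pass, and the audio tokens would come from (and be decoded by) an audio codec, details the summary above does not specify.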

Key highlights

One 0.6B model handles ASR and TTS; the README claims near-SOTA results on both.
Native training pipeline via Hugging Face Trainer and inference backends including PyTorch, vLLM, llama.cpp, SGLang, and mlx-lm.
ONNX runtime bundle supports local CPU execution with voice registration and a browser UI.
GPA-TTS spin-off offers INT4/INT8 quantized edge inference with zero-shot voice cloning.
Voice conversion is promised but not yet available in the v1.5 native path.
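
For a rough sense of why the precision options matter on edge hardware, the arithmetic below estimates weight storage for a 0.6B-parameter model at each precision named above. This is a back-of-the-envelope sketch under a simplifying assumption: it treats all 0.6B parameters as stored at one precision, whereas the listing only says the decoders are selectable and does not state which components are quantized. It also ignores activations, caches, and any codec weights.

```python
# Weights-only estimate; assumes every parameter uses the same precision,
# which the project listing does not confirm.
BYTES_PER_PARAM = {"FP32": 4.0, "FP16": 2.0, "INT8": 1.0, "INT4": 0.5}


def weight_footprint_gb(num_params: float, precision: str) -> float:
    """Return approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


for precision in BYTES_PER_PARAM:
    size = weight_footprint_gb(0.6e9, precision)
    print(f"{precision}: {size:.1f} GB")
```

Under these assumptions the weights come to roughly 2.4 GB at FP32, 1.2 GB at FP16, 0.6 GB at INT8, and 0.3 GB at INT4.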

Caveats

Voice conversion support is on the roadmap; the current v1.5 release covers ASR and TTS only.
Several announced features—interactive demo, basic service deployment recipes, and RKNN support—are still marked “coming soon” or unchecked on the roadmap.
TTS similarity scores in the published benchmark tables trail those of larger or closed-source rivals, even though error rates are competitive.

Verdict

Worth a look if you want a single, relatively small open model that can both transcribe and synthesize speech locally. Skip it for now if you need production-ready voice conversion or a fully polished hosted service.

Frequently asked

What is AutoArk/GPA? GPA aims to unify speech recognition, text-to-speech, and voice conversion in one compact autoregressive model so you can stop juggling separate audio pipelines.
Is GPA open source? Yes, AutoArk/GPA is open source, released under the Apache-2.0 license.
What language is GPA written in? AutoArk/GPA is primarily written in Python.
How popular is GPA? AutoArk/GPA has 1.6k stars on GitHub, and its star growth is currently accelerating (+47 stars per day over the past 7 days).
Where can I find GPA? AutoArk/GPA is on GitHub at https://github.com/AutoArk/GPA.