andimarafioti/faster-qwen3-tts
A PyTorch-based inference engine for Qwen3-TTS voice cloning that uses CUDA graph capture for real-time audio generation.

Velocity (7d): +9.9 ★/day · Trend: steady
This repository provides optimized, real-time text-to-speech inference for Qwen3-TTS, a neural TTS model from the Qwen family. It uses torch.cuda.CUDAGraph to capture the computation graph once and replay it on later steps, which speeds up generation without depending on Flash Attention or Triton. The library supports both streaming (chunk-by-chunk audio output) and non-streaming modes, and performs voice cloning from reference audio and text pairs.
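The difference between the streaming and non-streaming modes can be illustrated with a minimal, library-independent sketch. It does not use this repository's API: `fake_decode_step`, `generate_streaming`, `generate_full`, and all parameter names are hypothetical, and the "audio" is placeholder sample values rather than model output. Streaming yields each chunk as soon as enough decode steps have finished, while non-streaming returns the whole waveform at the end.

```python
from typing import Iterator, List


def fake_decode_step(step: int, samples_per_step: int) -> List[float]:
    """Hypothetical stand-in for one model decode step.

    Returns `samples_per_step` placeholder samples tagged with the step index
    so that ordering can be checked; a real engine would return audio here.
    """
    return [float(step)] * samples_per_step


def generate_streaming(
    total_steps: int, steps_per_chunk: int, samples_per_step: int
) -> Iterator[List[float]]:
    """Yield audio chunk by chunk, one chunk every `steps_per_chunk` steps."""
    chunk: List[float] = []
    for step in range(total_steps):
        chunk.extend(fake_decode_step(step, samples_per_step))
        if (step + 1) % steps_per_chunk == 0:
            yield chunk  # the caller can start playback before generation ends
            chunk = []
    if chunk:
        yield chunk  # flush the final, possibly shorter, chunk


def generate_full(
    total_steps: int, steps_per_chunk: int, samples_per_step: int
) -> List[float]:
    """Non-streaming mode: collect every chunk and return the full waveform."""
    audio: List[float] = []
    for chunk in generate_streaming(total_steps, steps_per_chunk, samples_per_step):
        audio.extend(chunk)
    return audio


if __name__ == "__main__":
    for i, chunk in enumerate(generate_streaming(10, 4, 100)):
        print(f"chunk {i}: {len(chunk)} samples")
    print(f"full: {len(generate_full(10, 4, 100))} samples")
```

In this sketch the non-streaming path simply drains the streaming generator, so both modes produce identical samples; only the time at which the caller first receives audio differs.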