
andimarafioti/faster-qwen3-tts

A PyTorch-based inference engine for Qwen3-TTS voice cloning that uses CUDA graph capture for real-time audio generation.

Velocity (7d): +9.9 ★ / day
Trend: steady

This repository provides optimized real-time text-to-speech inference using Qwen3-TTS, a neural TTS model from the Qwen family. It leverages torch.cuda.CUDAGraph to capture and replay computation graphs for fast generation without Flash Attention or Triton dependencies. The library supports both streaming (chunk-by-chunk audio output) and non-streaming modes, enabling voice cloning from reference audio and text pairs.
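
To illustrate the capture-and-replay idea described above, here is a minimal sketch of how torch.cuda.CUDAGraph is typically used in PyTorch. It is not taken from this repository: the helper name `make_step`, the `static_in`/`static_out` buffers, and the toy linear model are assumptions chosen for illustration, and the code falls back to plain eager execution when no CUDA device is available.

```python
import torch


def make_step(model, example):
    """Return a callable that runs `model` on inputs shaped like `example`.

    Hypothetical helper for illustration only. With CUDA available it
    captures one forward pass into a CUDA graph and replays it on each
    call; without CUDA it simply runs the model eagerly.
    """
    if not torch.cuda.is_available():
        def eager_step(x):
            with torch.no_grad():
                return model(x)
        return eager_step

    model = model.cuda()
    # Graph replay reuses fixed memory addresses, so inputs are copied
    # into this static buffer rather than passed in directly.
    static_in = example.clone().cuda()

    # Warm up on a side stream before capture, as PyTorch recommends.
    side = torch.cuda.Stream()
    side.wait_stream(torch.cuda.current_stream())
    with torch.cuda.stream(side), torch.no_grad():
        for _ in range(3):
            model(static_in)
    torch.cuda.current_stream().wait_stream(side)

    # Capture a single forward pass into the graph.
    graph = torch.cuda.CUDAGraph()
    with torch.cuda.graph(graph), torch.no_grad():
        static_out = model(static_in)

    def graph_step(x):
        static_in.copy_(x)         # refresh the captured input buffer
        graph.replay()             # re-launch the recorded kernels
        return static_out.clone()  # copy out before the next replay overwrites it

    return graph_step
```

A call such as `step = make_step(torch.nn.Linear(8, 4).eval(), torch.randn(2, 8))` followed by `step(x)` returns the same values as an eager forward pass. Replaying a captured graph avoids per-kernel launch overhead, which tends to matter most for small, repeated decoding steps with fixed tensor shapes.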
