SYSTRAN/faster-whisper
A CTranslate2-based reimplementation of OpenAI Whisper for faster speech-to-text transcription with quantization support.

Velocity (7d): +19 ★/day · Trend: steady
faster-whisper reimplements OpenAI's Whisper model on top of CTranslate2, a fast inference engine for Transformer models. It runs up to 4x faster than the original openai/whisper implementation at the same accuracy while using less memory. 8-bit (int8) quantization on both CPU and GPU improves efficiency further, and batched inference raises throughput. It serves as an optimized runtime for running Whisper models in production environments.
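As a rough illustration of the quantization and batching options described above, here is a minimal Python sketch. It assumes the `faster-whisper` package is installed and that the release is recent enough to ship `BatchedInferencePipeline`. The helper names `pick_compute_type` and `transcribe_file`, the default model size, and the audio path in the usage note are all hypothetical and chosen only for this example.

```python
def pick_compute_type(device: str) -> str:
    """Hypothetical helper: choose an int8 compute type for the target device."""
    if device == "cuda":
        # int8 weights with float16 activations on GPU
        return "int8_float16"
    if device == "cpu":
        return "int8"
    raise ValueError(f"unsupported device: {device!r}")


def transcribe_file(audio_path, model_size="small", device="cpu", batch_size=None):
    """Hypothetical helper: transcribe one file, optionally with batched inference."""
    # Imported lazily so pick_compute_type can be used without the package.
    from faster_whisper import BatchedInferencePipeline, WhisperModel

    model = WhisperModel(
        model_size,
        device=device,
        compute_type=pick_compute_type(device),
    )

    if batch_size is None:
        # Sequential decoding.
        segments, info = model.transcribe(audio_path, beam_size=5)
    else:
        # Batched decoding for higher throughput (assumes a recent release).
        pipeline = BatchedInferencePipeline(model=model)
        segments, info = pipeline.transcribe(audio_path, batch_size=batch_size)

    # `segments` is a generator: decoding happens while it is consumed.
    rows = [(seg.start, seg.end, seg.text) for seg in segments]
    return rows, info.language
```

Calling `transcribe_file("audio.mp3", device="cuda", batch_size=16)` would load the model with int8 weights on the GPU and decode in batches; omitting `batch_size` falls back to sequential decoding. The best model size and batch size depend on the hardware and the audio, so treat these values as starting points.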