SYSTRAN/faster-whisper

Whisper, but actually fast: a drop-in speed demon

A reimplementation of OpenAI's Whisper that trades the original inference engine for CTranslate2 and gains up to 4× speed without sacrificing accuracy.

★ 24.5k stars | Python | Inference · Serving | Image · Video · Audio

Velocity (7d): +22 ★ / day · Trend: ↘ cooling

What it does

faster-whisper is a Python reimplementation of OpenAI’s Whisper speech-to-text model that swaps the original inference stack for CTranslate2, a dedicated fast inference engine for Transformers. It supports GPU and CPU execution, 8-bit quantization, batched inference, and word-level timestamps. The project also bundles a VAD filter (Silero) to skip silent audio segments and works with distilled Whisper variants.
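A minimal sketch of how these features fit together, assuming the `faster-whisper` package is installed. `WhisperModel`, `transcribe()`, `vad_filter`, and `word_timestamps` come from the library; the helper names `format_segments` and `transcribe_file`, the `small` model size, and the CPU/int8 settings are illustrative choices, not anything the project prescribes.

```python
def format_segments(segments):
    """Render segment-like objects (start, end, text, optional words) as text lines."""
    lines = []
    for seg in segments:
        lines.append(f"[{seg.start:.2f}s -> {seg.end:.2f}s] {seg.text.strip()}")
        # `words` is only populated when word_timestamps=True was requested.
        for word in getattr(seg, "words", None) or []:
            lines.append(f"    {word.start:.2f}s {word.word.strip()}")
    return lines


def transcribe_file(path, model_size="small"):
    """Hypothetical helper: transcribe one file on CPU with VAD and word timestamps."""
    # Imported lazily so the formatting helper above works without the package.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, info = model.transcribe(
        path,
        vad_filter=True,        # skip silent stretches using the Silero VAD
        word_timestamps=True,   # attach per-word start/end times to each segment
    )
    # Iterating inside format_segments is what actually runs the transcription.
    return info.language, format_segments(segments)
```

Calling `transcribe_file("audio.mp3")` (a placeholder path) would return the detected language and one line per segment, with indented lines for each word.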

The interesting bit

The speedup doesn’t come from a new model architecture; it comes from aggressively optimized inference. In the project’s benchmarks on 13 minutes of audio, a large-v2 model on GPU drops from 2m23s with OpenAI’s implementation to 1m03s, and batched int8 inference finishes in 16 seconds. On CPU the gains are more modest, but int8 quantization still cuts memory use noticeably. The project has become infrastructure: WhisperX, speaches, and a half-dozen other tools build on it.
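To put the quoted timings in perspective, the arithmetic can be worked through directly. This is only a sketch built from the figures above; it assumes the benchmark audio is exactly 13 minutes long, and `parse_duration` is a hypothetical helper written for this example.

```python
import re


def parse_duration(text):
    """Convert a benchmark-style duration such as '2m23s' or '16s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(?:(\d+)s)?", text)
    if not match or not any(match.groups()):
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return minutes * 60 + seconds


openai_gpu = parse_duration("2m23s")    # OpenAI implementation, large-v2 on GPU
faster_gpu = parse_duration("1m03s")    # faster-whisper, same model, unbatched
batched_int8 = parse_duration("16s")    # faster-whisper, batched int8
audio_length = 13 * 60                  # assumed: exactly 13 minutes of audio

speedup_unbatched = openai_gpu / faster_gpu    # about 2.3x
speedup_batched = openai_gpu / batched_int8    # about 8.9x
realtime_factor = audio_length / batched_int8  # about 49x faster than real time
```

On these figures the unbatched GPU run is a little over twice as fast as the original, and the larger gains come from batching.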

Key highlights

Up to 4× faster than openai/whisper at the same accuracy, per project benchmarks
8-bit quantization on both CPU and GPU for lower memory footprint
Batched inference pipeline (BatchedInferencePipeline) for throughput over latency
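No system FFmpeg required: audio is decoded with PyAV, which bundles the FFmpeg libraries
Familiar API modeled on the original Whisper interface, though not strictly drop-in: transcribe() returns a (segments, info) tuple rather than a result dict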
Active ecosystem: used by WhisperX, whisper-ctranslate2, speaches, and others
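The quantization and batching highlights can be combined roughly as follows. This is a sketch, not the project's recommended configuration: `pick_compute_type` and `transcribe_batched` are hypothetical helpers, and the model size and batch size are placeholder values. `WhisperModel`, `BatchedInferencePipeline`, and the `int8` / `int8_float16` compute types are names from the library.

```python
def pick_compute_type(device):
    """Hypothetical helper: choose an 8-bit compute type for the target device."""
    if device == "cuda":
        return "int8_float16"   # int8 weights with float16 computation on GPU
    if device == "cpu":
        return "int8"
    raise ValueError(f"unsupported device: {device!r}")


def transcribe_batched(path, model_size="large-v3", device="cuda", batch_size=8):
    """Hypothetical helper: run the batched pipeline and return segment texts."""
    # Imported lazily so pick_compute_type can be used without the package.
    from faster_whisper import BatchedInferencePipeline, WhisperModel

    model = WhisperModel(
        model_size, device=device, compute_type=pick_compute_type(device)
    )
    pipeline = BatchedInferencePipeline(model=model)
    segments, info = pipeline.transcribe(path, batch_size=batch_size)
    # The list comprehension consumes the generator, which runs the transcription.
    return [segment.text for segment in segments]
```

Batching favors throughput: larger `batch_size` values generally use more memory, so the right value depends on the hardware available.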

Caveats

GPU setup requires specific NVIDIA library versions (CUDA 12 + cuDNN 9); mismatches force awkward ctranslate2 downgrades
CPU performance without batching or quantization is actually slower than whisper.cpp in the project’s own benchmarks
The generator-based transcribe() return means no transcription runs until you iterate over the segments, which is an easy footgun
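The generator caveat is easiest to see with a stand-in. The sketch below does not call faster-whisper at all; `fake_transcribe` is a hypothetical generator that mimics the lazy behavior of the segments returned by `transcribe()`.

```python
def fake_transcribe(chunks, log):
    """Stand-in for the lazy segments generator; records work in `log` as it happens."""
    for index, text in enumerate(chunks):
        log.append(f"decoded chunk {index}")
        yield text


log = []
segments = fake_transcribe(["hello", "world"], log)

work_before = list(log)        # still empty: creating the generator ran nothing
texts = list(segments)         # iterating (here via list()) triggers the work
work_after = list(log)         # now holds one entry per chunk
second_pass = list(segments)   # empty: a generator can only be consumed once

# With the real library the same pattern would be, assuming a loaded `model`:
#   segments, info = model.transcribe("audio.mp3")
#   segments = list(segments)  # forces the transcription to run to completion
```

Collecting the segments into a list up front avoids both surprises: the work happens at a predictable point, and the results can be read more than once.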

Verdict

If you’re already running Whisper in Python and need more throughput without touching C++, this is the pragmatic upgrade. If you’re on CPU-only and want absolute minimal dependencies, whisper.cpp still wins on simplicity and on CPU speed without batching or quantization.

Frequently asked

What is SYSTRAN/faster-whisper? A reimplementation of OpenAI's Whisper that trades the original inference engine for CTranslate2 and gains up to 4× speed without sacrificing accuracy.
Is faster-whisper open source? Yes, SYSTRAN/faster-whisper is open source, released under the MIT license.
What language is faster-whisper written in? SYSTRAN/faster-whisper is primarily written in Python.
How popular is faster-whisper? SYSTRAN/faster-whisper has 24.5k stars on GitHub and is currently cooling off.
Where can I find faster-whisper? SYSTRAN/faster-whisper is on GitHub at https://github.com/SYSTRAN/faster-whisper.