nyrahealth/CrisperWhisper
A Whisper-based automatic speech recognition model that provides verbatim transcription with accurate word-level timestamps and filler detection.

Velocity (7d): +1.3 ★/day · Trend: steady
CrisperWhisper extends OpenAI’s Whisper to transcribe speech verbatim, keeping disfluencies such as fillers ("um", "uh"), pauses, and false starts. It improves word-level timestamp accuracy through an adjusted tokenizer and a custom attention loss applied during training. The model was also trained to minimize hallucinations, and it took 1st place on the OpenASR Leaderboard.
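To make the value of verbatim output with word-level timestamps concrete, here is a minimal sketch of a post-processing step that locates fillers and long pauses. It does not call the model. It assumes the transcript arrives as a list of word chunks shaped like `{"text": ..., "timestamp": (start, end)}`, similar to the word-level output of a Hugging Face speech recognition pipeline. The names `FILLERS`, `normalize`, and `summarize`, the `min_pause` threshold, and the sample data are all hypothetical and chosen for illustration.

```python
# Hypothetical filler vocabulary; the description above only names "um" and "uh".
FILLERS = frozenset({"um", "uh"})


def normalize(token: str) -> str:
    """Lowercase a token and strip surrounding whitespace, brackets, and punctuation."""
    return token.strip().strip("[](){}.,!?;:\"'").lower()


def summarize(chunks, min_pause=0.5):
    """Return (fillers, pauses) found in a list of word-level chunks.

    fillers: list of (word, start, end) for tokens in FILLERS
    pauses:  list of (previous_end, next_start) for silent gaps of at
             least min_pause seconds between consecutive words
    """
    fillers = []
    pauses = []
    prev_end = None
    for chunk in chunks:
        start, end = chunk["timestamp"]
        if start is None or end is None:
            # Skip words with an incomplete timestamp (assumed possible).
            continue
        if normalize(chunk["text"]) in FILLERS:
            fillers.append((chunk["text"].strip(), start, end))
        if prev_end is not None and start - prev_end >= min_pause:
            pauses.append((prev_end, start))
        prev_end = end
    return fillers, pauses


# Invented sample data for illustration only; this is not real model output.
sample = [
    {"text": " So", "timestamp": (0.0, 0.25)},
    {"text": " um", "timestamp": (0.5, 0.75)},
    {"text": " I", "timestamp": (2.0, 2.25)},
    {"text": " I", "timestamp": (2.5, 2.75)},
    {"text": " think", "timestamp": (2.75, 3.0)},
    {"text": " [UH]", "timestamp": (3.0, 3.5)},
]

fillers, pauses = summarize(sample, min_pause=1.0)
print("fillers:", fillers)
print("pauses:", pauses)
```

Because a verbatim transcript keeps the repeated "I" and the fillers instead of smoothing them away, a downstream step like this can measure disfluency directly. A non-verbatim transcript would have discarded that information before any analysis could run.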