← all repositories
Cay-Zhang/SwiftSpeech

Speech recognition that actually feels like SwiftUI

A wrapper around Apple's Speech framework that replaces boilerplate with view modifiers and Combine publishers.

529 stars Swift Other AI
SwiftSpeech
Velocity · 7d
+0.2
★ / day
Trend
steady
star history

What it does

SwiftSpeech wraps Apple’s Speech framework so you can drop a record button into a SwiftUI view, attach modifiers like .swiftSpeechRecordOnHold() and .onRecognizeLatest(update: $text), and get transcribed speech without touching AVAudioEngine or SFSpeechRecognizer directly. It handles the mic and speech authorization prompts, though you still need to add the Info.plist usage descriptions yourself.

The interesting bit

The architecture separates view components (pure UI, no gesture handling) from functional components (the actual interaction logic). This means you can skin the record button however you want without rewriting the speech pipeline — or swap in a tap-to-toggle instead of hold-to-record by changing one modifier. The whole thing publishes through Combine, so you can chain onStartRecording(sendSessionTo:) into your existing reactive flow.

Key highlights

  • Three built-in demos (basic, colors, list) with Xcode preview support; “Preview on Device” required since the simulator lacks mic access
  • SwiftSpeech.Session exposes raw resultPublisher and stringPublisher for manual subscription if the modifiers aren’t enough
  • Supports locale configuration, custom vocabulary via contextualStrings, and on-device recognition options
  • Modifiers stack without overriding — closest to the functional component executes first
  • Also available via CocoaPods, though SPM is recommended

Caveats

  • iOS 13+ only; no macOS or watchOS mentioned in the README
  • The WeChat voice-message mock and additional examples live in a separate repo, not the main package
  • Authorization is “taken care of” but still requires manual Info.plist entries and an explicit .requestSpeechRecognitionAuthorization() call

Verdict

Worth a look if you’re building a SwiftUI app that needs voice input and you don’t want to drown in SFSpeechRecognizer boilerplate. Skip it if you need cross-platform support or want to deeply customize the audio pipeline — this is a convenience layer, not an engine replacement.

heatdrop uses Google Analytics to see which pages get read — nothing else. Your call. How we handle data.