RHVoice/RHVoice

A TTS engine that trades naturalness for your disk space

RHVoice generates intelligible speech from statistical models rather than raw audio chunks, making it small enough to run comfortably on low-end devices and screen readers.

1.8k stars · C++ · Image · Video · Audio
Velocity (7d): +0.3 ★ / day · Trend: steady

What it does

RHVoice is a free, open-source text-to-speech synthesizer supporting Russian, English, Portuguese, Esperanto, Georgian, Ukrainian, Kyrgyz, Tatar, Macedonian, Albanian, and Polish. It runs on Windows, Linux, and Android, plugging into standard TTS interfaces like SAPI5, Speech Dispatcher, and Android’s native APIs. It also ships its own driver for the NVDA screen reader.

The interesting bit

Instead of concatenating recorded speech fragments, RHVoice uses statistical parametric synthesis built on HTS (the HMM-based Speech Synthesis System), so a voice is stored as compressed statistical models rather than audio samples. The README is admirably honest about the trade-off: voices are less natural-sounding than concatenative alternatives, but the footprint stays tiny and intelligibility remains high. It’s a deliberate bet that “good enough and small” beats “nearly human and bloated” for accessibility use cases.
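
To make the footprint argument concrete, here is a toy back-of-the-envelope sketch in Python. It is not RHVoice's or HTS's actual model format: the corpus length, sample rate, state count, and feature dimension are hypothetical values picked only to show the order of magnitude, and the two helper functions are invented for this illustration.

```python
# Toy size comparison: shipping raw recordings vs. shipping per-state statistics.
# Every number below is a hypothetical assumption, not a measurement of RHVoice.


def raw_audio_bytes(seconds: float, sample_rate: int = 16_000,
                    bytes_per_sample: int = 2) -> int:
    """Size of uncompressed mono PCM audio of the given duration."""
    return int(seconds * sample_rate * bytes_per_sample)


def gaussian_model_bytes(num_states: int, feature_dim: int,
                         bytes_per_value: int = 4) -> int:
    """Size of a model keeping one mean and one variance vector per state."""
    return num_states * feature_dim * 2 * bytes_per_value


# Assumption: a concatenative voice ships about one hour of recordings.
corpus = raw_audio_bytes(seconds=3600)

# Assumption: a parametric voice keeps 10,000 states of 75-dimensional features.
model = gaussian_model_bytes(num_states=10_000, feature_dim=75)

print(f"raw corpus:        {corpus / 1e6:.1f} MB")
print(f"statistical model: {model / 1e6:.1f} MB")
```

Real concatenative engines can compress their audio and real parametric voices carry more than means and variances, so treat the gap as illustrative rather than as a benchmark.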

Key highlights

  • Voices are pure statistical models — no raw speech segments shipped to users
  • Native integration with NVDA screen reader (driver included)
  • Android builds available on both F-Droid and Google Play
  • Documentation and legal info maintained in English, Russian, and Ukrainian
  • Active community channels: GitHub Discussions, mailing list, IRC, and Matrix

Caveats

  • The README notes voices “lack the naturalness” of concatenative synthesizers — worth weighing if your use case prioritizes smoothness over size
  • Prebuilt binaries exist for Windows, but Linux users must compile or find packages; the build process isn’t detailed in the README itself
  • Support for additional languages is possible “in theory,” but only if the necessary resources can be found or created

Verdict

Screen reader users, accessibility developers, and anyone building TTS for resource-constrained devices should look here. If you need broadcast-quality narration or are allergic to compiling on Linux, look elsewhere.
