An abandoned voice assistant that teaches you to build one
A YouTube-accompanied PyTorch tutorial for wake-word detection and speech recognition, now seeking a new maintainer.

What it does
This repo walks through building a voice assistant from scratch in PyTorch, with working code for wake-word detection and speech recognition. It includes training scripts, data collection utilities, and pre-trained models you can fine-tune on your own voice. The author paired it with a YouTube series and explicitly designed it as a learning exercise in end-to-end ML engineering.
The interesting bit
The project treats voice assistants as a systems problem, not just a model problem. You collect your own audio, handle class imbalance by literally copying wake-word clips, optimize graphs for “production” deployment, and wire components together yourself. The README is essentially a lab manual with homework still undone.
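To make the class-imbalance trick concrete, here is a minimal sketch of oversampling by duplication in plain Python. It is not the repo's own script: the function name, the file names, and the (path, label) layout are all hypothetical, and the repo may well do the copying on disk instead of in memory.

```python
import random
from collections import Counter


def oversample_by_duplication(examples, positive_label=1, seed=0):
    """Repeat wake-word examples until both classes have the same count.

    `examples` is a list of (clip_path, label) pairs. This layout is an
    assumption made for illustration, not the repo's data format.
    """
    positives = [ex for ex in examples if ex[1] == positive_label]
    negatives = [ex for ex in examples if ex[1] != positive_label]

    # Nothing to do if there are no positives or they are not the minority.
    if not positives or len(positives) >= len(negatives):
        return list(examples)

    rng = random.Random(seed)
    # Draw duplicates (with replacement) to close the gap between classes.
    extra = [rng.choice(positives) for _ in range(len(negatives) - len(positives))]

    balanced = positives + extra + negatives
    rng.shuffle(balanced)
    return balanced


# Hypothetical dataset: 2 wake-word clips against 10 background clips.
examples = [("wake_0.wav", 1), ("wake_1.wav", 1)]
examples += [(f"bg_{i}.wav", 0) for i in range(10)]

balanced = oversample_by_duplication(examples)
counts = Counter(label for _, label in balanced)
```

If you try this, duplicate only the training split. Copying clips before splitting puts the same recording in both train and test, which inflates the reported accuracy.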
Key highlights
- Wake-word engine: trainable, with scripts to record, chunk, and label your own audio
- Speech recognition: pre-trained model available via Google Drive, fine-tunable on ~1 hour of personal voice data using Mimic Recording Studio
- Graph optimization scripts to freeze PyTorch models for inference
- Docker support for both CPU and CUDA builds
- Web GUI demo for the speech recognizer
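The “chunk” step in the wake-word workflow can be pictured as slicing one long recording into fixed-length clips. The sketch below is an assumption about what such a step looks like, not the repo's script: `chunk_samples`, the 8 kHz rate, and the 2-second window are all hypothetical, and the recording is a stand-in list instead of real audio.

```python
def chunk_samples(samples, sample_rate, chunk_seconds):
    """Split a sequence of audio samples into equal-length chunks.

    A trailing partial chunk is dropped so every clip has the same length.
    """
    chunk_len = int(sample_rate * chunk_seconds)
    if chunk_len <= 0:
        raise ValueError("chunk_seconds * sample_rate must be at least 1 sample")
    last_start = len(samples) - chunk_len
    return [samples[i:i + chunk_len] for i in range(0, last_start + 1, chunk_len)]


# Stand-in for a 10-second mono recording at an assumed 8 kHz sample rate.
sample_rate = 8000
recording = [0] * (10 * sample_rate)

chunks = chunk_samples(recording, sample_rate=sample_rate, chunk_seconds=2)
```

Each chunk would then be saved as its own clip and labeled as wake word or background.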
Caveats
- Author is no longer maintaining it and is actively seeking someone to take over
- Roughly half the system is TODO: no NLU, no speech synthesis, no skills framework, no core integration logic
- Windows support is spotty; torchaudio in particular may fail (WSL2 recommended)
- Raspberry Pi documentation is “in progress”, meaning it doesn’t exist yet
Verdict
Grab this if you want a hands-on, soup-to-nuts introduction to voice ML and don’t mind finishing the job yourself. Skip it if you need something that actually talks back today, or if abandoned code makes you nervous.