amanvirparhar/chaplin

Real-time silent speech recognition tool that reads lips via webcam and converts the mouthed words to text, using an Auto-AVSR visual speech recognition model with LLM post-processing.

740 stars · Python · Computer Vision · Language Models
Velocity (7d): +1.5 ★/day · Trend: steady

Chaplin is a visual speech recognition (VSR) system that captures video from a webcam, reads lip movements with the Auto-AVSR model trained on the Lip Reading Sentences 3 (LRS3) dataset, and converts silently mouthed words into text. A local LLM (qwen3:4b, served via ollama) then corrects the raw VSR output, and the corrected text is typed at the cursor. The system runs entirely locally with no cloud dependencies. Users press a key to start recording, mouth their words, and press the key again to stop.
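
The LLM correction step can be illustrated with a minimal sketch. It is not Chaplin's actual code: it assumes an Ollama server listening on its default local port and uses Ollama's `/api/generate` REST endpoint through the Python standard library. The prompt wording and the helper names (`build_payload`, `clean_reply`, `post_json`, `correct_transcript`) are hypothetical. Stripping a `<think>...</think>` block is a defensive assumption about how the model may format its reply.

```python
import json
import re
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL = "qwen3:4b"  # model named in the summary above

# Hypothetical prompt; the project's real prompt is not shown in this summary.
PROMPT_TEMPLATE = (
    "The following text came from a lip-reading model and may contain "
    "misrecognized words. Return only the corrected sentence.\n\n"
    "Text: {raw}"
)


def build_payload(raw_text, model=MODEL):
    """Build the JSON body for a non-streaming /api/generate call."""
    return {
        "model": model,
        "prompt": PROMPT_TEMPLATE.format(raw=raw_text.strip()),
        "stream": False,
    }


def clean_reply(reply):
    """Drop any <think>...</think> block and surrounding whitespace."""
    return re.sub(r"<think>.*?</think>", "", reply, flags=re.DOTALL).strip()


def post_json(url, payload):
    """Send the payload to a running Ollama server and decode its JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def correct_transcript(raw_text, send=post_json):
    """Return the LLM-corrected version of a raw VSR transcript."""
    reply = send(OLLAMA_URL, build_payload(raw_text))
    return clean_reply(reply["response"])
```

With the server running and the model pulled, `correct_transcript("HELO WORLD HOW ARE YOU")` would return the model's cleaned-up sentence. The `send` parameter exists so the transport can be swapped for a stub when no server is available.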