fikrikarim/parlor

Your laptop is now a voice AI that actually sees you

A weekend project proves you don't need OpenAI's servers—or an RTX 5090—to run real-time multimodal voice conversations locally.

★1.9k stars HTML Agents Inference · Serving Image · Video · Audio

What it does

Parlor is a browser-based voice assistant that runs entirely on your machine. You talk, point your camera at things, and it talks back. The heavy lifting happens locally via a FastAPI server: Google’s Gemma 4 E2B model handles speech and vision understanding through LiteRT-LM, while Kokoro generates text-to-speech responses. A simple WebSocket pipes audio and JPEG frames from your browser to the server and streams synthesized speech back.
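The single WebSocket described above carries two kinds of binary payload, audio and JPEG frames. A minimal sketch of how such messages could be tagged and told apart, assuming a one-byte type prefix plus a length field. The tag values and the names `pack_message` and `unpack_message` are hypothetical and are not taken from Parlor's code, which may use a different wire format.

```python
import struct

# Hypothetical one-byte message tags; not Parlor's actual protocol.
TAG_AUDIO = 0x01  # raw audio chunk
TAG_JPEG = 0x02   # one camera frame, JPEG-encoded

JPEG_MAGIC = b"\xff\xd8"  # every JPEG stream starts with the SOI marker FF D8


def pack_message(tag: int, payload: bytes) -> bytes:
    """Prefix the payload with a 1-byte tag and a 4-byte big-endian length."""
    return struct.pack(">BI", tag, len(payload)) + payload


def unpack_message(data: bytes) -> tuple[int, bytes]:
    """Split a framed message back into (tag, payload), validating it."""
    if len(data) < 5:
        raise ValueError("message shorter than the 5-byte header")
    tag, length = struct.unpack(">BI", data[:5])
    payload = data[5:]
    if tag not in (TAG_AUDIO, TAG_JPEG):
        raise ValueError(f"unknown tag {tag}")
    if len(payload) != length:
        raise ValueError("length field does not match payload size")
    if tag == TAG_JPEG and not payload.startswith(JPEG_MAGIC):
        raise ValueError("frame tagged as JPEG lacks the FF D8 marker")
    return tag, payload
```

On the server side, each binary WebSocket message would pass through something like `unpack_message` before being routed to the speech or vision input.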

The interesting bit

The author built this to solve a real sustainability problem: he was self-hosting a free English-learning voice AI for hundreds of users and needed to kill the server bill. Six months ago that required an RTX 5090. Now it runs on an M3 Pro laptop and needs roughly 3 GB of RAM. The “barge-in” feature is a nice touch: you can interrupt the AI mid-sentence, which is harder to get right than it sounds when everything is streaming in real time.
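One common way to get barge-in is to run playback as a cancellable task and cancel it the moment voice activity is detected. The sketch below simulates that with `asyncio`; the function names, the 50 ms "playback" sleep, and the timer standing in for a voice-activity detector are all illustrative assumptions, not Parlor's implementation.

```python
import asyncio


async def speak(sentences: list[str], spoken: list[str]) -> None:
    """Stand-in for TTS playback: 'plays' one sentence at a time."""
    for sentence in sentences:
        await asyncio.sleep(0.05)  # pretend each sentence takes 50 ms to play
        spoken.append(sentence)


async def converse_with_barge_in(sentences: list[str], interrupt_after: float) -> list[str]:
    """Start playback and cancel it if the (simulated) user starts talking."""
    spoken: list[str] = []
    user_speaking = asyncio.Event()

    # Simulated VAD: flag user speech after `interrupt_after` seconds.
    timer = asyncio.get_running_loop().call_later(interrupt_after, user_speaking.set)

    playback = asyncio.create_task(speak(sentences, spoken))
    vad = asyncio.create_task(user_speaking.wait())

    # Whichever finishes first wins: playback ending or the user speaking.
    await asyncio.wait({playback, vad}, return_when=asyncio.FIRST_COMPLETED)

    timer.cancel()
    for task in (playback, vad):
        if not task.done():
            task.cancel()  # cancelling `playback` here is the barge-in
    await asyncio.gather(playback, vad, return_exceptions=True)
    return spoken
```

Calling `asyncio.run(converse_with_barge_in(["One.", "Two.", "Three."], 0.08))` returns only the sentences that finished before the interruption. A real system also has to flush audio already queued in the browser, which this sketch does not model.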

Key highlights

End-to-end latency of ~2.5–3.0 seconds on Apple M3 Pro (1.8–2.2 s for speech/vision understanding, ~0.3 s to generate a ~25-token response, 0.3–0.7 s for TTS)
Decode speed: ~83 tokens/sec on GPU via LiteRT-LM
Sentence-level TTS streaming means audio starts before the full response is finished
Browser-based VAD (Silero) for hands-free operation, no push-to-talk button
Platform-aware TTS: MLX on Mac, ONNX on Linux
~2.6 GB model download on first run, auto-fetched from HuggingFace
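Sentence-level TTS streaming, as listed above, amounts to cutting the model's token stream at sentence boundaries and handing each sentence to the synthesizer as soon as it is complete. A minimal sketch, assuming tokens arrive as plain strings and that `.`, `!`, and `?` end a sentence; the function name is hypothetical and the splitting rule is deliberately naive.

```python
from collections.abc import Iterable, Iterator

SENTENCE_END = (".", "!", "?")


def sentences_from_tokens(tokens: Iterable[str]) -> Iterator[str]:
    """Yield each sentence as soon as its closing punctuation arrives."""
    buffer = ""
    for token in tokens:
        buffer += token
        if buffer.rstrip().endswith(SENTENCE_END):
            yield buffer.strip()
            buffer = ""
    if buffer.strip():
        yield buffer.strip()  # flush a trailing fragment with no terminator


# Each yielded sentence could be passed to TTS while later tokens still decode.
for sentence in sentences_from_tokens(["Hi", " there", ".", " How", " are", " you", "?"]):
    print(sentence)
```

Because the function is a generator, the first sentence is available before the rest of the response has been decoded. A production splitter would also need to handle abbreviations and decimals, which this rule would cut in the wrong place. As a consistency check on the figures above, 25 tokens at ~83 tokens/sec is about 0.3 s.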

Caveats

Explicitly marked “research preview” with expected rough edges and bugs
macOS requires Apple Silicon; Linux needs a supported GPU
Python 3.12+ only, and the frontend is a single index.html—don’t expect a polished UI
The author notes you “can’t do agentic coding with this”; it’s narrowly scoped to conversation
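The platform and Python constraints above can be checked before attempting a first run. The following is a rough preflight sketch using only the standard library; the function names and the `"mlx"` / `"onnx"` return strings are illustrative labels for the MLX-on-Mac, ONNX-on-Linux split mentioned in the highlights, not identifiers from Parlor itself. It cannot verify that a Linux machine has a supported GPU.

```python
import platform
import sys


def pick_tts_backend(system: str, machine: str) -> str:
    """Map an OS / CPU pair to the TTS runtime the summary describes."""
    if system == "Darwin":
        if machine != "arm64":  # Apple Silicon reports "arm64"
            raise RuntimeError("macOS requires Apple Silicon")
        return "mlx"
    if system == "Linux":
        return "onnx"  # GPU support still has to be checked separately
    raise RuntimeError(f"unsupported platform: {system}")


def python_ok(version: tuple[int, int]) -> bool:
    """True when the interpreter meets the Python 3.12+ requirement."""
    return version >= (3, 12)


def preflight() -> str:
    """Check the current interpreter and machine, returning the backend label."""
    if not python_ok(sys.version_info[:2]):
        raise RuntimeError("Python 3.12 or newer is required")
    return pick_tts_backend(platform.system(), platform.machine())
```

Keeping `pick_tts_backend` free of direct `platform` calls makes the mapping easy to test with explicit arguments, while `preflight()` applies it to the machine at hand.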

Verdict

Worth a spin if you’re building local AI assistants, teaching language learners, or just want to see how far small models have come. Skip it if you need reliability, broad hardware support, or anything beyond a conversational demo—the author is upfront that this is an early experiment, not a product.

Frequently asked

What is fikrikarim/parlor?: A weekend project proves you don't need OpenAI's servers—or an RTX 5090—to run real-time multimodal voice conversations locally.
Is parlor open source?: Yes — fikrikarim/parlor is open source, released under the Apache-2.0 license.
What language is parlor written in?: fikrikarim/parlor is primarily written in HTML.
How popular is parlor?: fikrikarim/parlor has 1.9k stars on GitHub.
Where can I find parlor?: fikrikarim/parlor is on GitHub at https://github.com/fikrikarim/parlor.