uezo/ChatdollKit

Unity SDK turns VRM models into voice-chat AI companions

A C# toolkit that wires lip-sync, facial expressions, and speech recognition to multiple LLMs so your 3D character can actually hold a conversation.

1.2k stars · C# · Agents · Chat Assistants
Velocity (7d): +0.5 ★/day · Trend: steady

What it does

ChatdollKit is a Unity SDK for building voice-enabled 3D chatbots from VRM models. It handles the full conversational loop: speech-to-text, LLM inference, text-to-speech, and synchronized animations including lip-sync, blinking, and facial expressions. It targets desktop, mobile, VR/AR, and WebGL.
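To make the shape of that loop concrete, here is a minimal sketch of one conversational turn. It is written in Python so it runs outside Unity, and it is not ChatdollKit's API: `run_turn`, `Turn`, and the `fake_*` stages are hypothetical names, and each stage is a stub standing in for a real STT, LLM, TTS, or lip-sync service.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Turn:
    heard: str          # text recognized from the user's audio
    reply: str          # text generated by the LLM
    audio: bytes        # synthesized speech for the reply
    visemes: List[str]  # mouth shapes used to drive lip-sync


def run_turn(
    mic_audio: bytes,
    stt: Callable[[bytes], str],
    llm: Callable[[str], str],
    tts: Callable[[str], bytes],
    lip_sync: Callable[[str], List[str]],
) -> Turn:
    """Run one pass of the loop: listen, think, speak, animate."""
    heard = stt(mic_audio)
    reply = llm(heard)
    audio = tts(reply)
    visemes = lip_sync(reply)
    return Turn(heard, reply, audio, visemes)


# Hypothetical stand-ins for real services (assumptions, not real APIs).
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


def fake_lip_sync(text: str) -> List[str]:
    # Crude approximation: one viseme per vowel in the reply.
    return [ch for ch in text.lower() if ch in "aeiou"]


turn = run_turn(b"hello", fake_stt, fake_llm, fake_tts, fake_lip_sync)
print(turn.reply)
```

Passing each stage in as a callable mirrors the pluggable-provider idea: swapping one speech or LLM backend for another only changes which function is handed to `run_turn`.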

The interesting bit

The project treats conversation as a real-time performance problem, not just an API call. Recent releases added WebSocket streaming STT to shave “several hundred milliseconds” off latency, barge-in support so users can interrupt mid-sentence, and multi-VAD noise resistance for event venues. It also runs entirely in WebGL with JavaScript interop.
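Barge-in is essentially cancellation of in-progress playback once the user starts talking. The sketch below illustrates that idea with Python's `asyncio`; it is an assumption about the general technique, not ChatdollKit's implementation. `speak` and `converse_with_barge_in` are hypothetical names, and the sleeps stand in for audio playback and voice-activity detection.

```python
import asyncio
from typing import List


async def speak(sentences: List[str], spoken: List[str]) -> None:
    """Play sentences one at a time, recording each one that finishes."""
    for sentence in sentences:
        await asyncio.sleep(0.1)  # stand-in for playing one sentence of audio
        spoken.append(sentence)


async def converse_with_barge_in(
    sentences: List[str], interrupt_after: float
) -> List[str]:
    """Start speaking, then cancel playback when the user interrupts."""
    spoken: List[str] = []
    playback = asyncio.create_task(speak(sentences, spoken))

    # Stand-in for a voice-activity detector noticing user speech.
    await asyncio.sleep(interrupt_after)

    playback.cancel()  # barge-in: stop talking immediately
    try:
        await playback
    except asyncio.CancelledError:
        pass  # cancellation is the expected outcome here
    return spoken


script = ["Hello there.", "Let me explain.", "First point.", "Second point."]
spoken = asyncio.run(converse_with_barge_in(script, interrupt_after=0.25))
print(spoken)
```

Only the sentences that finished before the interruption end up in `spoken`, which is also the information a system would need in order to tell the LLM how much of its reply the user actually heard.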

Key highlights

  • Pluggable LLMs: OpenAI, Anthropic Claude, Google Gemini, Grok, Dify, plus function calling and multimodal inputs
  • Broad TTS/STT support: Azure, Google, OpenAI, VOICEVOX, AivisSpeech, Style-Bert-VITS2, NijiVoice, with TTS preprocessing for pronunciation tuning
  • 3D expression system: autonomous animation, face control, idle behaviors, runtime VRM model switching
  • Conversation management: wake words, intent routing, context state, long-term memory via ChatMemory/mem0/Zep, dynamic multilingual switching
  • External control: socket commands, JavaScript control in WebGL, remote client support for VTuber-style setups

Caveats

  • Setup is multi-step: import dependencies, configure scene objects, and attach API keys to three separate inspector components just to run the demo
  • The README notes that legacy components were removed in 0.8.4 and points 0.7.x users to a separate migration guide
  • Some features (AIAvatarKit backend, AutoGen integration) are mentioned but not deeply documented in the visible README sections

Verdict

Worth a look if you’re building interactive 3D characters, AI VTubers, or kiosk-style virtual agents in Unity. Skip it if you need a hosted, no-code solution or aren’t already in the Unity ecosystem.
