Is aiavatarkit open source?

Yes — uezo/aiavatarkit is open source, released under the Apache-2.0 license.

What language is aiavatarkit written in?

uezo/aiavatarkit is primarily written in Python.

How popular is aiavatarkit?

uezo/aiavatarkit has 645 stars on GitHub.

Where can I find aiavatarkit?

uezo/aiavatarkit is on GitHub at https://github.com/uezo/aiavatarkit.


uezo/aiavatarkit

A Python framework for talking anime heads and VRChat bots

AIAvatarKit wires up speech-to-speech pipelines so you can build conversational avatars without writing glue code for VAD, STT, LLM, and TTS yourself.

★645 stars Python Chat Assistants Agents Inference · Serving




What it does

AIAvatarKit is a Python framework that handles the full speech-to-speech loop for conversational AI avatars. It ingests audio, detects voice activity, transcribes speech, runs it through an LLM, synthesizes a response, and drives facial expressions and lip sync. Out of the box it targets VRChat, cluster, and Vket Cloud, but also runs standalone via WebSocket or HTTP with browser-based UIs.
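The loop described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not AIAvatarKit's actual API: `run_turn`, `AvatarResponse`, and every stage function below are hypothetical names, and the stages are trivial stand-ins for real VAD, STT, LLM, and TTS components.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class AvatarResponse:
    """Hypothetical container for one avatar turn (not an AIAvatarKit type)."""
    text: str
    audio: bytes
    expression: str


def run_turn(
    audio_chunks: List[bytes],
    is_speech: Callable[[bytes], bool],      # VAD stage
    transcribe: Callable[[bytes], str],      # STT stage
    generate: Callable[[str], str],          # LLM stage
    synthesize: Callable[[str], bytes],      # TTS stage
    pick_expression: Callable[[str], str],   # facial expression stage
) -> Optional[AvatarResponse]:
    # Keep only the chunks the VAD stage flags as speech.
    voiced = [chunk for chunk in audio_chunks if is_speech(chunk)]
    if not voiced:
        return None  # nothing was said, so the avatar stays idle
    user_text = transcribe(b"".join(voiced))
    reply = generate(user_text)
    return AvatarResponse(reply, synthesize(reply), pick_expression(reply))


# Stand-in stages so the sketch runs without any audio or model dependencies.
result = run_turn(
    audio_chunks=[b"\x00\x00", b"hel", b"lo"],
    is_speech=lambda chunk: any(chunk),         # all-zero bytes count as silence
    transcribe=lambda audio: audio.decode(),    # pretend the bytes are the words
    generate=lambda text: f"You said: {text}",
    synthesize=lambda text: text.encode(),
    pick_expression=lambda text: "smile",
)
silent = run_turn(
    [b"\x00"], any, bytes.decode, str.upper, str.encode, lambda text: "neutral"
)
```

Passing each stage in as a callable is one simple way to keep the loop independent of any particular speech or model provider.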

The interesting bit

The project treats the avatar as a platform-agnostic backend service rather than a client-side toy. It abstracts LLM differences behind a unified interface, supports server-side conversation history via OpenAI’s Responses API, and can stream progress updates during slow tool calls so the avatar doesn’t just freeze mid-sentence.
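One generic way to keep an avatar talking during a slow tool call is to poll the pending call and emit filler messages until it finishes. The sketch below shows that pattern with `asyncio`; it is an assumption about how such streaming could work, not the project's implementation, and `call_tool_with_progress`, `slow_lookup`, and the filler text are hypothetical.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable


async def call_tool_with_progress(
    tool: Callable[[], Awaitable[str]],
    interval: float = 0.05,
    filler: str = "One moment...",
) -> AsyncIterator[str]:
    """Yield filler messages while the tool runs, then yield its result."""
    task = asyncio.create_task(tool())
    try:
        while True:
            # Wait up to `interval` seconds for the tool to finish.
            done, _ = await asyncio.wait({task}, timeout=interval)
            if done:
                break
            yield filler  # still running: give the avatar something to say
        yield task.result()
    finally:
        # If the consumer stops early, do not leave the tool call dangling.
        if not task.done():
            task.cancel()


async def slow_lookup() -> str:
    # Stand-in for a slow tool such as a web search or database query.
    await asyncio.sleep(0.12)
    return "It is sunny today."


async def collect() -> list:
    return [msg async for msg in call_tool_with_progress(slow_lookup)]


messages = asyncio.run(collect())
```

The number of filler messages depends on timing, but the final item is always the tool's result, so a caller can speak each item as it arrives.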

Key highlights

Modular pipeline: swap VAD (Silero, built-in silence detection), STT (Google, Azure, OpenAI, AmiVoice), LLM (ChatGPT, Claude, Gemini, Dify, LiteLLM), and TTS (VOICEVOX, OpenAI, Style-Bert-VITS2)
WebSocket variant of OpenAI Responses API claims up to 40% latency reduction in tool-call-heavy workflows
Agent-native: tool calls, dynamic tool calls, background execution, and MCP support
Character management with diaries, schedules, long-term memory, and automated daily updates
Runs on Raspberry Pi and integrates with Twilio for telephony
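The swappable-component idea in the first highlight can be illustrated with a structural interface. This is a hedged sketch of the general pattern only: `LLMBackend`, `EchoBackend`, `ShoutBackend`, and `Avatar` are hypothetical names that do not come from AIAvatarKit.

```python
from typing import Protocol


class LLMBackend(Protocol):
    """Hypothetical interface: any object with a matching chat() qualifies."""

    def chat(self, user_text: str) -> str: ...


class EchoBackend:
    # Stand-in for one provider.
    def chat(self, user_text: str) -> str:
        return f"echo: {user_text}"


class ShoutBackend:
    # Stand-in for a different provider with different behavior.
    def chat(self, user_text: str) -> str:
        return user_text.upper() + "!"


class Avatar:
    def __init__(self, llm: LLMBackend) -> None:
        self.llm = llm

    def respond(self, user_text: str) -> str:
        # The avatar depends only on the interface, never on a concrete backend.
        return self.llm.chat(user_text)


echo_reply = Avatar(EchoBackend()).respond("hi")
shout_reply = Avatar(ShoutBackend()).respond("hi")
```

Swapping the backend changes the reply without touching `Avatar`, which is the property a modular pipeline relies on.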

Caveats

README warns that technical blog posts may reference pre-v0.6 APIs; some features require downgrading to v0.5.8 to match older tutorials
Claude and Gemini support is API-only (no Amazon Bedrock or Vertex AI)
OpenAI Responses API variants don’t support Dynamic Tool Calls due to server-side history management

Verdict

Worth a look if you’re building voice-driven NPCs, VTuber backends, or telephony bots and don’t want to hand-roll the audio pipeline. Skip it if you just need a chat widget — this is overkill without the speech and avatar layers.
