Self-hosted anime companion for the terminally online
A fully open-source VTuber stack that lets you run your own "Neuro-sama" style AI character locally, with the hardware requirements of a modest gaming PC.

What it does
AIwaifu is a self-hosted VTuber pipeline: a local LLM (Pygmalion 1.3B) generates dialogue, a translation model converts it to Japanese so the VITS text-to-speech voice sounds “cuter,” and VTube Studio handles the avatar animation. The author split the architecture into a RAM-hungry inference server and a lightweight client, so you can offload the models to a home server while running the animated face on your desktop.
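The flow described above can be pictured as three chained stages. What follows is a minimal sketch, not the project's actual code: the `Pipeline` class, its field names, and the stand-in lambdas are all hypothetical, chosen only to show how text moves from the chat model through translation to speech.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Pipeline:
    """Chains the three model stages (hypothetical structure, not AIwaifu's API)."""

    generate: Callable[[str], str]      # chat LLM: user text -> English reply
    translate: Callable[[str], str]     # English reply -> Japanese text
    synthesize: Callable[[str], bytes]  # Japanese text -> audio bytes

    def respond(self, user_text: str) -> dict:
        reply_en = self.generate(user_text)
        reply_ja = self.translate(reply_en)
        audio = self.synthesize(reply_ja)
        # The English text can be shown on screen while the Japanese audio plays.
        return {"text": reply_en, "spoken_text": reply_ja, "audio": audio}


# Stand-in stages so the sketch runs without any models installed.
demo = Pipeline(
    generate=lambda text: f"You said: {text}",
    translate=lambda text: f"[ja] {text}",
    synthesize=lambda text: text.encode("utf-8"),
)

result = demo.respond("hello")
```

In a real setup each lambda would be replaced by a call into the corresponding model, and the client would take the returned audio and use it to drive the avatar.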
The interesting bit
The project wears its influences openly (“inspired by Neuro-sama,” “simpable,” “LEWDABLE”), but the practical commitment is stricter: no proprietary models, no ChatGPT, no censorship hooks. Everything runs offline or on your own metal. The author even flags that the requirements.txt is bloated and untrimmed, which is either refreshing honesty or a warning, depending on your tolerance for dependency archaeology.
Key highlights
- Fully offline stack: Pygmalion 1.3B + Facebook NLLB-600M translation + VITS TTS + VTube Studio integration
- Split client/server architecture lets you run inference on a separate machine (12 GB RAM minimum, 16 GB recommended)
- GPU inference path: 8 GB VRAM minimum, NVIDIA-only, tested on a K80
- Supports custom datasets and personality fine-tuning
- Explicitly open-source ethos: no proprietary LLMs, no API keys, no content filtering
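To make the client/server split concrete, here is a rough sketch using only the Python standard library. It is an illustration under assumptions, not the project's real protocol: the `/chat` path, the JSON field names, and the `run_inference` placeholder are all invented for this example.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def run_inference(prompt: str) -> str:
    """Placeholder for the heavy model work that would live on the server."""
    return f"echo: {prompt}"


class ChatHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"reply": run_inference(payload["prompt"])}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def ask(host: str, port: int, prompt: str) -> str:
    """Lightweight client call: send a prompt, get the reply text back."""
    conn = HTTPConnection(host, port, timeout=10)
    try:
        data = json.dumps({"prompt": prompt}).encode("utf-8")
        conn.request("POST", "/chat", body=data,
                     headers={"Content-Type": "application/json"})
        return json.loads(conn.getresponse().read())["reply"]
    finally:
        conn.close()


# Port 0 asks the OS for any free port; a real deployment would bind a fixed
# address on the inference machine instead of localhost.
server = ThreadingHTTPServer(("127.0.0.1", 0), ChatHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
reply = ask("127.0.0.1", server.server_port, "hi")
server.shutdown()
server.server_close()
```

The point of the design is that only `run_inference` needs the RAM or VRAM; the machine calling `ask` just needs a network connection and enough headroom for the avatar software.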
Caveats
- Setup is involved: you need Python 3.8, Poetry or venv, C/C++ build tools, CMake, and Git LFS, you have to compile a monotonic_align module by hand, and you need VTube Studio plus an audio plugin that is purchased and installed separately
- The README warns “Sometime shit can be broke (Especially in the server)” and notes the requirements.txt contains “bloated” packages
- Japanese TTS output is hardcoded; English-native voices would require swapping the VITS model
- Docker/cloud deployment is “planned… but not soon”
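Given the length of that prerequisite list, a quick preflight check before cloning may save some frustration. This is a sketch under assumptions: the executable names `git`, `git-lfs`, and `cmake` are the usual ones but may differ on your system, and the script does not check for a C/C++ compiler because the right one varies by platform.

```python
import shutil
import sys

# Assumed executable names; adjust for your platform or package manager.
REQUIRED_TOOLS = ["git", "git-lfs", "cmake"]


def missing_tools(tools):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def python_matches(major, minor):
    """True if the running interpreter is exactly major.minor."""
    return sys.version_info[:2] == (major, minor)


report = {
    "missing_tools": missing_tools(REQUIRED_TOOLS),
    "python_3_8": python_matches(3, 8),
}
print(report)
```

An empty `missing_tools` list and `python_3_8: True` only mean the basics are in place; the monotonic_align build and the VTube Studio side still have to be done by hand.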
Verdict
Worth a weekend if you already run local LLMs and want to animate them for streaming or Discord presence. Skip it if you want something that installs in five minutes or runs on a MacBook Air.