
Henry-23/VideoChat

Real-time voice-interactive digital human system supporting customizable appearance, voice, and cloning with sub-3s latency.

Velocity (7d): +2.1 ★/day · Trend: steady

VideoChat is a real-time interactive digital human demo that supports both end-to-end and cascade architectures. The end-to-end approach uses a multimodal LLM (GLM-4-Voice) for direct speech-to-speech generation, while the cascade approach chains ASR (FunASR), an LLM (Qwen), TTS (GPT-SoVITS or CosyVoice), and talking-head generation (MuseTalk). Users can customize the avatar's appearance and voice, including voice cloning.
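
To illustrate the cascade architecture described above, here is a minimal Python sketch of four stages chained in order: speech recognition, language model, speech synthesis, and talking-head rendering. It is not VideoChat's actual code. The `CascadePipeline` class, the `respond` method, and the `fake_*` stand-in functions are all hypothetical names invented for this example, and none of them call the real FunASR, Qwen, GPT-SoVITS, CosyVoice, or MuseTalk APIs.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CascadePipeline:
    """Hypothetical cascade: each stage is an injected callable."""
    asr: Callable[[bytes], str]            # user audio -> transcript
    llm: Callable[[str], str]              # transcript -> reply text
    tts: Callable[[str], bytes]            # reply text -> reply audio
    talking_head: Callable[[bytes], bytes]  # reply audio -> video

    def respond(self, user_audio: bytes) -> bytes:
        # Run the stages strictly in sequence; each output feeds the next.
        transcript = self.asr(user_audio)
        reply_text = self.llm(transcript)
        reply_audio = self.tts(reply_text)
        return self.talking_head(reply_audio)


# Stand-in stages so the sketch runs without any model installed.
def fake_asr(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(prompt: str) -> str:
    return f"echo: {prompt}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


def fake_talking_head(speech: bytes) -> bytes:
    return b"video[" + speech + b"]"


pipeline = CascadePipeline(fake_asr, fake_llm, fake_tts, fake_talking_head)
video = pipeline.respond(b"hello")
```

Because each stage is passed in as a plain callable, one TTS backend could be swapped for another without touching the rest of the chain. An end-to-end design would instead replace the first three stages with a single speech-to-speech callable.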
