krillinai/KrillinAI

A video localization factory that speaks agent

A Go-based pipeline that turns YouTube links into translated, dubbed, platform-formatted videos, and exposes every stage as a CLI skill for AI agents to orchestrate.

Velocity (7d): +19 ★ / day · Trend: steady

What it does

KrillinAI runs the full drudge-work chain of video localization: download via yt-dlp, Whisper-based transcription, LLM translation with terminology replacement, TTS dubbing (CosyVoice or OpenAI), and reformatting for landscape or portrait output. It targets the frame geometries of Bilibili, Douyin, TikTok, YouTube, and other platforms. Human users get a desktop or web UI; machines get a staged CLI and a skills/ directory of stable contracts.

The interesting bit

The project treats the AI agent as a first-class user, not a buzzword. Each pipeline stage emits structured artifacts plus a krillinai_manifest.json manifest, so later stages can resume without re-running transcription. On completion the CLI prints a single JSON line to stdout: it is built to be shelled out to, not merely to tolerate it.
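
To make the single-JSON-line contract and the resume idea concrete, here is a minimal Python sketch of what an orchestrating agent might do with the CLI's output. It is an illustration under stated assumptions, not the project's actual schema: the `completed_stages` key and the sample field names are hypothetical, and only the file name krillinai_manifest.json comes from the description above.

```python
import json


def parse_result_line(stdout_text: str) -> dict:
    """Parse the JSON object found on the last non-empty line of stdout.

    Taking the last non-empty line is a defensive choice in case anything
    else ever precedes the result line.
    """
    lines = [ln for ln in stdout_text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("no output to parse")
    result = json.loads(lines[-1])
    if not isinstance(result, dict):
        raise ValueError("expected a JSON object on the final line")
    return result


def pending_stages(manifest: dict, wanted: list[str]) -> list[str]:
    """Return the wanted stages that the manifest does not mark as done.

    ASSUMPTION: "completed_stages" is a hypothetical key used only for
    illustration; the real manifest layout may differ.
    """
    done = set(manifest.get("completed_stages", []))
    return [stage for stage in wanted if stage not in done]


if __name__ == "__main__":
    # Hypothetical stdout captured from a finished stage.
    sample_stdout = '{"status": "ok", "manifest": "krillinai_manifest.json"}\n'
    print(parse_result_line(sample_stdout))

    # Hypothetical manifest content: transcription is already done.
    sample_manifest = {"completed_stages": ["subtitle"]}
    print(pending_stages(sample_manifest, ["subtitle", "tts"]))
```

Because the result arrives as one line of JSON, a caller needs no log scraping: it reads stdout, parses one object, and decides what to run next.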

Key highlights

  • Staged CLI commands: subtitle, tts, render-horizontal, render-vertical, pipeline, cover
  • Multiple speech-recognition backends: OpenAI cloud Whisper, FasterWhisper (local), WhisperKit (Apple Silicon), WhisperCpp, plus Alibaba Cloud ASR for mainland China
  • LLM-agnostic: any OpenAI API-compatible endpoint, including local deployments
  • Desktop app exists but README notes it “has some bugs that are continuously being updated”; server/web UI is the stable path
  • On macOS, both versions ship as unsigned binaries, so the quarantine attribute must be stripped and chmod run by hand
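
The staged commands in the first bullet can be sequenced by a caller. The sketch below is an assumption-heavy illustration: the stage names come from the list above, but the ordering, the mapping of landscape and portrait to the two render commands, and the binary name `krillinai` are guesses. Per-stage flags are left out on purpose because they are not described here.

```python
import shlex

# ASSUMPTION: which render stage matches which layout is inferred from the names.
RENDER_STAGE = {
    "landscape": "render-horizontal",
    "portrait": "render-vertical",
}


def plan_stages(layout: str, with_cover: bool = False) -> list[str]:
    """Return an ordered list of stage names for one output layout.

    ASSUMPTION: the order follows the described chain (subtitles first,
    then dubbing, then rendering); the real CLI may impose a different one.
    """
    if layout not in RENDER_STAGE:
        raise ValueError(f"unknown layout: {layout!r}")
    stages = ["subtitle", "tts", RENDER_STAGE[layout]]
    if with_cover:
        stages.append("cover")
    return stages


def stage_commands(binary: str, stages: list[str]) -> list[list[str]]:
    """Build one argv list per stage, without any stage-specific flags."""
    return [[binary, stage] for stage in stages]


if __name__ == "__main__":
    # "krillinai" is a hypothetical binary name used for illustration.
    for argv in stage_commands("krillinai", plan_stages("portrait")):
        print(shlex.join(argv))
```

Keeping each stage as its own argv list lets an agent run, retry, or skip stages individually rather than re-running the whole chain.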

Caveats

  • Desktop version explicitly flagged as newer and buggier; server/web deployment is the conservative choice
  • macOS users must run xattr and chmod commands before either version will launch
  • TTS options are limited to Alibaba Cloud Voice Service and OpenAI TTS; no local TTS engine listed
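
For the macOS caveat, the two commands can be assembled as follows. This is a sketch rather than the README's exact instructions: `com.apple.quarantine` is the standard attribute macOS sets on downloaded files, while the binary path is a hypothetical placeholder and the precise flags the project recommends may differ.

```python
import shlex


def macos_unblock_commands(binary_path: str) -> list[list[str]]:
    """Return the argv lists that clear quarantine and mark a file executable.

    The commands are only built here, not executed, so the caller can
    review them first.
    """
    return [
        # Remove the quarantine attribute (-d), recursing (-r) in case the
        # path is an app bundle rather than a single file.
        ["xattr", "-dr", "com.apple.quarantine", binary_path],
        # Make the downloaded file executable.
        ["chmod", "+x", binary_path],
    ]


if __name__ == "__main__":
    # Hypothetical download location, for illustration only.
    for argv in macos_unblock_commands("./KrillinAI_macOS"):
        print(shlex.join(argv))
```

To run them, each argv list could be passed to `subprocess.run(argv, check=True)` on a Mac.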

Verdict

Worth a look if you run content localization at volume or want to wire video translation into an automated workflow. Skip it if you need a polished one-click consumer app: the rough edges are documented, not hidden.
