RanFeng/clipsketch-ai

Bilibili meets storyboard: an AI sketch pipeline for Chinese social video

A React app that ingests Bilibili and Xiaohongshu links, lets you tag frames, and feeds them to Gemini for hand-drawn storyboards and viral copy.

clipsketch-ai · Velocity (7d): +9.1 ★/day · Trend: steady

What it does

ClipSketch AI is a browser-based video annotation tool built for Chinese social-media creators. Paste a Bilibili or Xiaohongshu link, scrub through the video with frame-level keyboard controls, press T to tag moments, then hand the tagged frames to Google Gemini. The model returns a unified hand-drawn storyboard, three flavors of “grass-planting” (zhongcao, i.e. product-seeding) copy, and a vertical cover image. The app is a React 19 + Tailwind frontend that keeps local state in IndexedDB; a Docker image is provided.
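The tag-and-scrub workflow described above can be pictured with a small sketch. Everything here is hypothetical: the `Tag` type, the helper names, the 30 fps default, and the `MM:SS.mmm` timeline format are assumptions for illustration, not taken from the ClipSketch AI source.

```typescript
// Hypothetical types and helpers; none of these names come from the project.
interface Tag {
  timeSec: number; // playback position when the tag hotkey was pressed
  note?: string;   // optional free-text label
}

const ASSUMED_FPS = 30; // assumption: the real frame rate depends on the video

// Move the playhead by a whole number of frames, clamped to the clip bounds.
function stepFrames(
  currentSec: number,
  frames: number,
  durationSec: number,
  fps: number = ASSUMED_FPS,
): number {
  const next = currentSec + frames / fps;
  return Math.min(Math.max(next, 0), durationSec);
}

// Format seconds as MM:SS.mmm for a plain-text timeline.
function formatTimestamp(sec: number): string {
  const totalMs = Math.round(sec * 1000);
  const minutes = Math.floor(totalMs / 60000);
  const seconds = Math.floor((totalMs % 60000) / 1000);
  const ms = totalMs % 1000;
  const pad = (n: number, width: number): string => String(n).padStart(width, "0");
  return `${pad(minutes, 2)}:${pad(seconds, 2)}.${pad(ms, 3)}`;
}

// Sort tags by time and emit one line per tag.
function toTimeline(tags: Tag[]): string {
  return [...tags]
    .sort((a, b) => a.timeSec - b.timeSec)
    .map((t) => (t.note ? `${formatTimestamp(t.timeSec)} ${t.note}` : formatTimestamp(t.timeSec)))
    .join("\n");
}
```

In a browser, `stepFrames` would presumably feed the result into a video element's `currentTime`, and `toTimeline` would back a TXT export; both wirings are guesses about how such an app might be structured.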

The interesting bit

The project treats Gemini as a full creative department rather than a chatbot: one model call synthesizes multiple tagged frames into a coherent visual narrative, another generates platform-native copy in three distinct voices (emotional story, dry tutorial, punchy micro-format). The README also notes a batch-processing mode and custom-character fusion, suggesting the author has actually burned through API quota tuning the pipeline.
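To make the "one call, many frames" idea concrete, here is a minimal sketch of how several tagged frames could be packed into a single multimodal request. The builder function, the prompt wording, and the type names are hypothetical; the `text` / `inlineData` part layout is assumed to mirror what Gemini's generateContent request accepts, and no network call is made.

```typescript
// Hypothetical request builder; prompt wording and type names are illustrative.
interface TaggedFrame {
  timeSec: number;
  pngBase64: string; // frame image, base64-encoded without a "data:" prefix
}

type Part =
  | { text: string }
  | { inlineData: { mimeType: string; data: string } };

interface GenerateContentRequest {
  model: string;
  contents: { role: "user"; parts: Part[] }[];
}

// Pack all frames, in chronological order, into one user turn.
function buildStoryboardRequest(
  frames: TaggedFrame[],
  model: string = "gemini-3-pro-image-preview", // model name as listed in the review
): GenerateContentRequest {
  if (frames.length === 0) {
    throw new Error("at least one tagged frame is required");
  }
  const ordered = [...frames].sort((a, b) => a.timeSec - b.timeSec);
  const parts: Part[] = [
    { text: `Draw one hand-drawn storyboard with ${ordered.length} panels, in the order given.` },
  ];
  for (const frame of ordered) {
    // A short caption before each image keeps the ordering explicit.
    parts.push({ text: `Frame at ${frame.timeSec.toFixed(2)}s` });
    parts.push({ inlineData: { mimeType: "image/png", data: frame.pngBase64 } });
  }
  return { model, contents: [{ role: "user", parts }] };
}
```

Sending every frame in one request, rather than one request per frame, is what would let the model keep characters and style consistent across panels; whether the project structures its prompts this way is not confirmed by the README summary.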

Key highlights

  • Imports Bilibili and Xiaohongshu share links (including mixed-text shares) and gets cross-origin video to play via proxying plus referrerPolicy="no-referrer"
  • Frame-accurate tagging with T hotkey; exports TXT timelines or ZIPs of captured frames
  • Storyboard generation via gemini-3-pro-image-preview; copy and cover via gemini-3-pro-preview
  • Responsive layout that flips to vertical stack on mobile
  • Docker one-liner: docker run -p 3000:3000 earisty/clipsketch-ai:latest
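Mixed-text shares are the strings the mobile apps copy to the clipboard, with a title and call-to-action wrapped around the URL. A minimal sketch of pulling the link out follows; the function name and the hostname list (including the `b23.tv` and `xhslink.com` short-link domains) are assumptions, not the project's actual parsing logic.

```typescript
// Hostnames are assumptions about what the platforms use for share links.
const SUPPORTED_HOSTS = ["bilibili.com", "b23.tv", "xiaohongshu.com", "xhslink.com"];

type Platform = "bilibili" | "xiaohongshu";

// Return the first supported link found in a pasted share string, or null.
function extractShareLink(shareText: string): { url: string; platform: Platform } | null {
  // Match URLs made of ASCII characters only, so adjacent CJK text is not swallowed.
  const candidates = shareText.match(/https?:\/\/[^\s\u0080-\uffff]+/g) ?? [];
  for (const candidate of candidates) {
    // Drop trailing ASCII punctuation that belongs to the sentence, not the URL.
    const raw = candidate.replace(/[.,;:!?)\]]+$/, "");
    let host: string;
    try {
      host = new URL(raw).hostname.toLowerCase();
    } catch {
      continue; // not a parseable URL
    }
    const matched = SUPPORTED_HOSTS.find((h) => host === h || host.endsWith("." + h));
    if (!matched) continue;
    const platform: Platform =
      matched === "bilibili.com" || matched === "b23.tv" ? "bilibili" : "xiaohongshu";
    return { url: raw, platform };
  }
  return null;
}
```

Matching on the parsed hostname rather than a substring avoids accepting look-alike domains such as `notbilibili.com`.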

Caveats

  • Requires a Google Cloud project with explicit access to gemini-3-pro-image-preview; expect 403s if your key lacks that model scope
  • Video playback relies on proxy workarounds that may break if platform CDN policies shift

Verdict

Worth a spin if you regularly repurpose Chinese short-video content into illustrated threads or Xiaohongshu posts. Skip it if you need generic video editing or don’t have a Gemini API key with preview-model access.
