jordanrendric/claude-video-vision
A Claude Code plugin enabling AI to watch and understand videos through frame extraction and multimodal audio analysis.

Velocity · 7d
+11
★ / day
Trend
→steady
star history
This plugin extends Claude Code with video comprehension by extracting frames via ffmpeg and transcribing audio through multiple backends (Gemini API, local Whisper, or OpenAI API). It acts as a perception layer, delivering frames as images and timestamped transcriptions to Claude. The MCP server supports YouTube URLs via yt-dlp and adaptively adjusts extraction parameters based on user queries.