Your coding agent can now export to MP4
A local meta-layer that turns HTML, CSS, and a prompt into rendered video — no cloud fees, no new DSL to learn.

What it does
html-video is a local studio and CLI that lets coding agents (Claude Code, Cursor, Aider, etc.) generate real MP4s from HTML and CSS templates. You paste a link, describe a video, or point at a repo; the agent builds a multi-frame storyboard, fills it with your content, and renders it locally via headless Chromium and ffmpeg. Apache-2.0, no per-render fees.
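The final encode step can be pictured as a plain ffmpeg call over numbered frame images. This is a minimal sketch, not the project's actual pipeline: the function name `ffmpegArgs`, the PNG frame pattern, and the choice of H.264 are all assumptions made for illustration. Only the ffmpeg flags themselves are standard.

```typescript
// Hypothetical helper: builds the argument list for encoding numbered
// screenshots (e.g. captured from headless Chromium) into an MP4.
// Nothing here is taken from the html-video codebase.
function ffmpegArgs(framePattern: string, fps: number, outFile: string): string[] {
  return [
    "-y",                        // overwrite the output file if it exists
    "-framerate", String(fps),   // input frame rate for the image sequence
    "-i", framePattern,          // e.g. "frames/frame_%04d.png"
    "-c:v", "libx264",           // encode video as H.264
    "-pix_fmt", "yuv420p",       // pixel format most players accept
    outFile,
  ];
}

const args = ffmpegArgs("frames/frame_%04d.png", 30, "out.mp4");
console.log(["ffmpeg", ...args].join(" "));
```

The array form is meant to be passed to a process spawner rather than a shell string, which avoids quoting problems with paths.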
The interesting bit
The project treats video engines as pluggable backends behind a single render(input, ctx) adapter. Today only the Hyperframes engine (Chromium + ffmpeg) is wired up; Remotion, Motion Canvas, and Manim are on the roadmap but not yet built. The pitch is that you never learn the engine’s DSL — the agent and template system absorb that complexity. Whether that abstraction holds when those adapters actually land is the bet.
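To make the adapter idea concrete, here is a minimal sketch of what a pluggable engine layer could look like. The source names only `render(input, ctx)`; the field names, the `EngineRegistry` class, and the stub adapter below are hypothetical and are not the project's real types.

```typescript
// Hypothetical shapes: only the render(input, ctx) signature comes from the
// write-up. Every field and class name below is an assumption.
interface RenderInput {
  html: string;        // frame markup produced from a template
  durationMs: number;  // how long the frame should run
}

interface RenderContext {
  outDir: string;      // where the engine should write its output
  fps: number;
}

interface RenderResult {
  engine: string;
  outputPath: string;
}

interface EngineAdapter {
  readonly name: string;
  render(input: RenderInput, ctx: RenderContext): Promise<RenderResult>;
}

class EngineRegistry {
  private adapters = new Map<string, EngineAdapter>();

  register(adapter: EngineAdapter): void {
    this.adapters.set(adapter.name, adapter);
  }

  has(engine: string): boolean {
    return this.adapters.has(engine);
  }

  // Callers pick an engine by name and never touch engine-specific APIs.
  async render(engine: string, input: RenderInput, ctx: RenderContext): Promise<RenderResult> {
    const adapter = this.adapters.get(engine);
    if (!adapter) {
      throw new Error(`No adapter registered for engine "${engine}"`);
    }
    return adapter.render(input, ctx);
  }
}

// Stand-in adapter: reports where a real engine would write, renders nothing.
const stubHyperframes: EngineAdapter = {
  name: "hyperframes",
  async render(_input, ctx) {
    return { engine: "hyperframes", outputPath: `${ctx.outDir}/out.mp4` };
  },
};

const registry = new EngineRegistry();
registry.register(stubHyperframes);
```

Under this design, a future engine would only need to implement `EngineAdapter` and register under its own name. Whether the real adapters can stay that thin is exactly the open question.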
Key highlights
- 21 curated HTML/CSS templates (data viz, kinetic type, cinematic frames, etc.) with live previews in the studio gallery
- Content-graph storyboard: agent structures source material into nodes and edges, topo-sorted into frame sequences
- Source-aware generation: fetches web articles (including WeChat 公众号, i.e. official accounts) and GitHub repos server-side, flattens them to Markdown, and builds the video from the actual content rather than canned filler
- Optional AI soundtrack via MiniMax (music + narration), mixed at export
- 13 coding agents auto-detected on PATH; switchable from studio top bar
- Single-frame fast path skips storyboard for simple template renders
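The content-graph bullet above can be illustrated with a short topological sort. This is a sketch using Kahn's algorithm, a standard technique; the write-up does not say which algorithm the project uses, and the `StoryNode` and `StoryEdge` shapes and the sample storyboard are hypothetical.

```typescript
// Hypothetical shapes: the write-up says only "nodes and edges, topo-sorted".
interface StoryNode {
  id: string;
  title: string;
}

// An edge [from, to] means "from" must appear before "to".
type StoryEdge = [string, string];

function topoSortFrames(nodes: StoryNode[], edges: StoryEdge[]): string[] {
  const indegree = new Map<string, number>();
  const next = new Map<string, string[]>();
  for (const n of nodes) {
    indegree.set(n.id, 0);
    next.set(n.id, []);
  }
  for (const [from, to] of edges) {
    if (!indegree.has(from) || !indegree.has(to)) {
      throw new Error(`Edge references unknown node: ${from} -> ${to}`);
    }
    next.get(from)!.push(to);
    indegree.set(to, indegree.get(to)! + 1);
  }

  // Kahn's algorithm: repeatedly emit a node with no unmet prerequisites.
  const queue = nodes.filter((n) => indegree.get(n.id) === 0).map((n) => n.id);
  const order: string[] = [];
  while (queue.length > 0) {
    const id = queue.shift()!;
    order.push(id);
    for (const to of next.get(id)!) {
      const remaining = indegree.get(to)! - 1;
      indegree.set(to, remaining);
      if (remaining === 0) queue.push(to);
    }
  }

  // Any node left unemitted sits on a cycle, so no valid frame order exists.
  if (order.length !== nodes.length) {
    throw new Error("Storyboard graph contains a cycle");
  }
  return order;
}

const frames = topoSortFrames(
  [
    { id: "outro", title: "Call to action" },
    { id: "intro", title: "Hook" },
    { id: "demo", title: "Demo" },
    { id: "problem", title: "Problem" },
  ],
  [
    ["intro", "problem"],
    ["problem", "demo"],
    ["demo", "outro"],
  ],
);
console.log(frames); // [ 'intro', 'problem', 'demo', 'outro' ]
```

The nodes are deliberately listed out of order to show that the edges, not the input order, decide the frame sequence.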
Caveats
- Only the Hyperframes engine is actually implemented; the pluggable architecture is real but the other engines are “planned” or “researching”
- The AI soundtrack requires a call to MiniMax, an external service; everything else runs locally
- The README says the studio fetches sources server-side, but it is unclear whether that means a component hosted by nexu-io or a local server; the WeChat example suggests some non-local infrastructure
Verdict
Worth a look if you’re already living in agent-driven workflows and want video output without subscribing to a cloud render farm. Skip it if you need production reliability today across multiple render engines — this is a promising scaffold, not a finished multi-backend system.