marcelo-earth/generative-manim

LLMs that finally learned to draw — in Python

A project that turns text prompts into Manim animation code, with a clever training pipeline that uses the renderer itself as a reward signal.

872 stars · Python · Creative · Design
Velocity (7d): +0.7 ★ / day · Trend: steady

What it does

Generative Manim is a toolkit that feeds your text prompt to an LLM and gets back Python code for Manim, the math-animation engine. Run that code, and you have a video. The project wraps this in a demo web app, an API, and a cloud deployment guide. It supports a small zoo of models — OpenAI’s GPT-4o and GPT-5.5, several Claude variants, Google’s Gemini, and a growing set of open-weight models via Featherless.
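To make the prompt-to-code step concrete, here is a minimal sketch of how such a wrapper could look. It is not the project's implementation: `INSTRUCTIONS`, `extract_code`, `generate_scene_code`, and `fake_llm` are hypothetical names, and the model call is replaced by a stub so the example runs offline.

```python
import re
from typing import Callable

# Built indirectly so this snippet can itself live inside a fenced block.
FENCE = "`" * 3
CODE_BLOCK = re.compile(FENCE + r"[A-Za-z]*[ \t]*\n(.*?)" + FENCE, re.DOTALL)

# Hypothetical instruction text; the project's real prompts are not shown here.
INSTRUCTIONS = (
    "Write a complete Manim script containing exactly one Scene subclass. "
    "Reply with a single fenced Python code block."
)


def extract_code(reply: str) -> str:
    """Return the first fenced code block in the reply, or the whole reply if none."""
    match = CODE_BLOCK.search(reply)
    return (match.group(1) if match else reply).strip()


def generate_scene_code(prompt: str, llm: Callable[[str], str]) -> str:
    """Ask the model for Manim code and strip the chat wrapping around it."""
    reply = llm(f"{INSTRUCTIONS}\n\nRequest: {prompt}")
    return extract_code(reply)


def fake_llm(message: str) -> str:
    """Stand-in for a real model call (assumption: any str -> str callable works)."""
    body = (
        "from manim import *\n\n"
        "class Demo(Scene):\n"
        "    def construct(self):\n"
        "        self.play(Create(Circle()))\n"
    )
    return "Sure, here it is:\n" + FENCE + "python\n" + body + FENCE + "\nEnjoy!"


code = generate_scene_code("draw a circle", fake_llm)
print(code)
```

Saving the printed script as `scene.py` and running `manim -ql scene.py Demo` should then produce a low-quality preview video, assuming Manim Community is installed. Swapping `fake_llm` for a real client is the only change needed to point the sketch at an actual model.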

The interesting bit

The real meat is the training pipeline for open-source models. The maintainers distill from GPT-4o through supervised fine-tuning (SFT), then apply DPO (Direct Preference Optimization) on pairs of outputs where one rendered and the other failed, then GRPO (Group Relative Policy Optimization): reinforcement learning where the Manim renderer itself supplies the reward. The generated code either renders or crashes, so there is no need for a separate learned reward model. It’s the same trick DeepSeek-R1 uses with math answer checkers, applied to animation code.
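The renderer-as-reward idea can be sketched in a few lines. This is an illustration under stated assumptions, not the repository's code: `render_reward`, `preference_pairs`, and `group_advantages` are hypothetical names, the default command simply executes the script with the current Python interpreter as a stand-in for the renderer, and the advantage uses the population standard deviation (implementations differ on this detail).

```python
import statistics
import subprocess
import sys
import tempfile
from pathlib import Path


def render_reward(code: str, command_prefix=None, timeout_s: float = 60.0) -> float:
    """Return 1.0 if the command exits cleanly on the generated script, else 0.0."""
    # Assumption: the default just runs the script with this interpreter.
    # A real setup would presumably pass something like ["manim", "-ql"].
    prefix = list(command_prefix) if command_prefix else [sys.executable]
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "scene.py"
        script.write_text(code, encoding="utf-8")
        try:
            result = subprocess.run(
                prefix + [str(script)],
                cwd=workdir,
                stdin=subprocess.DEVNULL,  # never block on an interactive prompt
                capture_output=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return 0.0  # a hang counts as a failure
        return 1.0 if result.returncode == 0 else 0.0


def preference_pairs(samples, rewards):
    """Build (chosen, rejected) pairs for DPO from samples of one prompt."""
    winners = [s for s, r in zip(samples, rewards) if r > 0.5]
    losers = [s for s, r in zip(samples, rewards) if r <= 0.5]
    return [(w, l) for w in winners for l in losers]


def group_advantages(rewards):
    """GRPO-style advantage: each reward normalized against its own group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    if std == 0.0:
        return [0.0 for _ in rewards]  # all samples tied: no learning signal
    return [(r - mean) / std for r in rewards]


samples = ["print('renders fine')", "raise SystemExit(1)"]
rewards = [render_reward(s) for s in samples]
print(rewards, group_advantages(rewards))
```

The design point is that the reward is computed by running the code, so it cannot be gamed the way a learned reward model can. The flip side, visible in `group_advantages`, is that a group where every sample renders (or every sample crashes) contributes no gradient signal.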

Key highlights

  • 12+ model backends, from GPT-4o to Qwen 2.5 Coder, with a unified interface
  • 3-stage open-source training: SFT → DPO → GRPO, using QLoRA to fit on free Kaggle T4 GPUs
  • Executable benchmark suite with render-based scoring, pass@k evaluation, and JSONL reports
  • Includes a command-injection fix in ffmpeg export (credit to a contributor)
  • Active Discord community and multi-language docs
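
For readers unfamiliar with pass@k, here is a small sketch of the commonly used unbiased estimator together with a JSONL-style report. The task names, record fields, and helper names are hypothetical and chosen only for illustration; the project's actual report schema may differ.

```python
import json
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Probability that at least one of k samples passes, given c of n passed.

    Uses the standard estimator 1 - C(n - c, k) / C(n, k).
    """
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    # math.comb returns 0 when k > n - c, which yields 1.0 as expected.
    return 1.0 - comb(n - c, k) / comb(n, k)


def to_jsonl(records) -> str:
    """Serialize one JSON object per line."""
    return "\n".join(json.dumps(record, sort_keys=True) for record in records)


# Hypothetical render outcomes: True means the sample rendered successfully.
outcomes = {
    "circle_basic": [True, False, True, True],
    "graph_axes": [False, False, False, False],
}

report = [
    {
        "task": task,
        "n": len(results),
        "passed": sum(results),
        "pass@1": pass_at_k(len(results), sum(results), 1),
        "pass@2": pass_at_k(len(results), sum(results), 2),
    }
    for task, results in outcomes.items()
]
print(to_jsonl(report))
```

With render-based scoring, "passed" means only that the script produced a video without crashing, so a high pass@k says nothing about whether the animation matches the prompt.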

Caveats

  • The open-source models (Qwen, DeepSeek Coder, CodeLlama) are marked 🚧 — work in progress
  • The README calls this a “concept” and “prototype”, so production readiness is unclear
  • The benchmark is described as an “MVP”, so it is explicitly early-stage

Verdict

Worth a look if you’re building LLM-to-code pipelines or need programmatic video generation without touching After Effects. Skip it if you want a polished, end-to-end consumer tool — this is still research-grade infrastructure with sharp edges.
