When your paper needs a figure but your Illustrator skills peaked in 2012
An unofficial open-source rebuild of Google Research's PaperBanana that turns raw methodology text into publication-ready diagrams via a multi-agent pipeline.

What it does
Feed it a text description of your method, or an entire PDF, and PaperBanana spits out a publication-quality diagram or statistical plot. It runs a small bureaucracy of up to 7 specialized agents, including an input optimizer, a retriever that picks from 13 reference templates, a planner, a stylist chasing NeurIPS aesthetics, a visualizer, and a critic that iterates until it is satisfied or until you hit the 30-iteration safety cap. Outputs land in timestamped folders with all intermediate drafts preserved.
The interesting bit
The pipeline treats diagram generation like a peer-review process rather than a single prompt. The critic agent evaluates the pipeline's output against the source context and writes a revision memo for the next round. There’s also a batch mode with composite figure stitching, which is useful when you’re generating an entire paper’s worth of (a), (b), (c) panels from one YAML manifest.
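The review loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the names `refine`, `Review`, `toy_draw`, and `toy_critique` are hypothetical, and the two toy functions stand in for the model-backed visualizer and critic. Only the 30-iteration cap and the "critique, write a memo, redraw" shape come from the write-up.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

MAX_ITERATIONS = 30  # safety cap mentioned in the write-up


@dataclass
class Review:
    satisfied: bool  # True when the critic accepts the draft
    memo: str        # revision notes handed to the next round


def refine(
    source_context: str,
    draw: Callable[[str, Optional[str]], str],
    critique: Callable[[str, str], Review],
    max_iterations: int = MAX_ITERATIONS,
) -> List[str]:
    """Run draw -> critique rounds; return every draft, final one last."""
    drafts: List[str] = []
    memo: Optional[str] = None
    for _ in range(max_iterations):
        draft = draw(source_context, memo)          # visualizer step
        drafts.append(draft)                        # keep intermediate drafts
        review = critique(draft, source_context)    # critic step
        if review.satisfied:
            break
        memo = review.memo                          # feed the memo forward
    return drafts


# Toy stand-ins (hypothetical) so the sketch runs without any API keys.
def toy_draw(context: str, memo: Optional[str]) -> str:
    return f"diagram({context})" + (" +legend" if memo else "")


def toy_critique(draft: str, context: str) -> Review:
    ok = "+legend" in draft
    return Review(satisfied=ok, memo="" if ok else "add a legend")


if __name__ == "__main__":
    drafts = refine("encoder-decoder", toy_draw, toy_critique)
    print(len(drafts), drafts[-1])
```

Keeping every draft in the returned list mirrors the "all intermediate drafts preserved" behavior; the cap guarantees the loop ends even if the critic is never satisfied.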
Key highlights
- Supports OpenAI (GPT-5.2 + GPT-Image-1.5), Azure OpenAI/Foundry, and Google Gemini free tier
- Optional input optimization layer parallelizes “context enrichment” and “caption sharpening”
- CLI, Python API, local Gradio Studio, and MCP server for IDE integration
- Claude Code skills for /generate-diagram, /generate-plot, and /evaluate-diagram
- PDF inputs with per-page selection via the optional PyMuPDF dependency
- Batch generation with auto-stitched composite figures
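For the input optimization highlight in the list above, running "context enrichment" and "caption sharpening" in parallel could look roughly like the sketch below. It assumes both steps are independent async calls; `enrich_context`, `sharpen_caption`, and `optimize_inputs` are hypothetical names, and the bodies are placeholders for real model requests rather than anything taken from the project.

```python
import asyncio
from typing import Tuple


async def enrich_context(text: str) -> str:
    # Placeholder for a model call that expands the methodology text.
    await asyncio.sleep(0)
    return text.strip() + " [enriched]"


async def sharpen_caption(caption: str) -> str:
    # Placeholder for a model call that tightens the figure caption.
    await asyncio.sleep(0)
    return caption.strip().rstrip(".") + "."


async def optimize_inputs(text: str, caption: str) -> Tuple[str, str]:
    # Neither step depends on the other, so both can run concurrently.
    context, sharpened = await asyncio.gather(
        enrich_context(text),
        sharpen_caption(caption),
    )
    return context, sharpened


if __name__ == "__main__":
    print(asyncio.run(optimize_inputs(" method text ", "Overview of the model")))
```

With real network-bound calls, `asyncio.gather` lets the total wait approach the slower of the two requests instead of their sum.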
Caveats
- This is explicitly unofficial and community-driven; the README disclaims any affiliation with Google Research or the original authors
- The default model names (GPT-5.2, GPT-Image-1.5) are taken from the README as written; they may not map to publicly available endpoints under those labels
- Requires external API keys; no local-only mode is mentioned
Verdict
Worth a spin if you’re churning out ML papers and would rather automate the figure grind than wrestle with TikZ. Skip it if you need pixel-perfect control or can’t stomach cloud API costs for iterative image generation.