llmsresearch/paperbanana

When your paper needs a figure but your Illustrator skills peaked in 2012

An unofficial open-source rebuild of Google Research's PaperBanana that turns raw methodology text into publication-ready diagrams via a multi-agent pipeline.

Velocity (7d): +15 ★/day · Trend: steady

What it does Feed it a text description of your method, or an entire PDF, and PaperBanana spits out a publication-quality diagram or statistical plot. It runs a small bureaucracy of up to 7 specialized agents, including an input optimizer, a retriever that picks from 13 reference templates, a planner, a stylist chasing NeurIPS aesthetics, a visualizer, and a critic that iterates until it is satisfied or the run hits the 30-iteration safety cap. Outputs land in timestamped folders with all intermediate drafts preserved.
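
The iterate-until-satisfied loop with a safety cap can be sketched roughly as below. This is a minimal illustration, not PaperBanana's actual code: `run_refinement`, `visualize`, and `critique` are hypothetical names, drafts are plain text files rather than images, and the stub critic simply approves the third draft. Only the 30-iteration cap and the timestamped folder with preserved drafts come from the description above.

```python
import tempfile
from datetime import datetime
from pathlib import Path

MAX_ITERATIONS = 30  # safety cap mentioned in the write-up


def visualize(plan: str, memo: str) -> str:
    """Hypothetical stand-in for the visualizer agent; returns a text 'draft'."""
    suffix = f" [revised: {memo}]" if memo else ""
    return f"diagram for {plan!r}{suffix}"


def critique(draft: str, round_no: int) -> tuple[bool, str]:
    """Hypothetical stand-in for the critic; approves on the third round."""
    if round_no >= 3:
        return True, ""
    return False, f"fix issue {round_no}"


def run_refinement(plan: str, out_root: Path,
                   max_iterations: int = MAX_ITERATIONS) -> list[Path]:
    """Loop draft -> critique until approved or the cap is hit, keeping every draft."""
    # Assumed folder naming; the real tool's scheme may differ.
    run_dir = out_root / datetime.now().strftime("run_%Y%m%d_%H%M%S")
    run_dir.mkdir(parents=True, exist_ok=True)

    drafts: list[Path] = []
    memo = ""  # revision memo from the previous round, empty on the first pass
    for round_no in range(1, max_iterations + 1):
        draft = visualize(plan, memo)
        path = run_dir / f"draft_{round_no:02d}.txt"
        path.write_text(draft, encoding="utf-8")
        drafts.append(path)

        approved, memo = critique(draft, round_no)
        if approved:
            break
    return drafts


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for saved in run_refinement("encoder-decoder overview", Path(tmp)):
            print(saved.name)
```

The cap matters because each round presumably costs an image-generation call: without it, a critic that never approves would loop indefinitely.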

The interesting bit The pipeline treats diagram generation like a peer-review process rather than a single prompt. The critic agent evaluates the generated output against the source context and writes a revision memo for the next round. There’s also a batch mode with composite figure stitching, which is useful when you’re generating an entire paper’s worth of (a), (b), (c) panels from one YAML manifest.
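
To make the composite-figure idea concrete, here is a small sketch of how panels might be labeled and placed on a grid before stitching. It is an assumption-laden illustration: `Panel` and `layout_panels` are hypothetical names, the input is a plain Python list rather than the tool's YAML manifest (whose schema is not shown here), and no image compositing is performed.

```python
from dataclasses import dataclass
from string import ascii_lowercase


@dataclass
class Panel:
    label: str   # caption-style label such as "(a)"
    source: str  # path of the generated image for this panel
    row: int     # zero-based grid row
    col: int     # zero-based grid column


def layout_panels(sources: list[str], columns: int = 2) -> list[Panel]:
    """Assign (a), (b), (c)... labels and row-major grid positions."""
    if columns < 1:
        raise ValueError("columns must be at least 1")
    if len(sources) > len(ascii_lowercase):
        raise ValueError("this sketch supports at most 26 panels")
    return [
        Panel(f"({ascii_lowercase[i]})", src, i // columns, i % columns)
        for i, src in enumerate(sources)
    ]


if __name__ == "__main__":
    for panel in layout_panels(["overview.png", "encoder.png", "loss.png"]):
        print(panel.label, panel.source, panel.row, panel.col)
```

A stitching step would then paste each image at its grid cell and draw the label beside it.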

Key highlights

  • Supports OpenAI (GPT-5.2 + GPT-Image-1.5), Azure OpenAI/Foundry, and Google Gemini free tier
  • Optional input optimization layer parallelizes “context enrichment” and “caption sharpening”
  • CLI, Python API, local Gradio Studio, and MCP server for IDE integration
  • Claude Code skills for /generate-diagram, /generate-plot, and /evaluate-diagram
  • PDF inputs with per-page selection via optional PyMuPDF dependency
  • Batch generation with auto-stitched composite figures
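
The parallel input-optimization step from the list above could look something like the following. This is a sketch under stated assumptions: `enrich_context`, `sharpen_caption`, and `optimize_inputs` are hypothetical names, and the two stand-in functions only tidy strings, whereas the real steps would presumably call a model API.

```python
from concurrent.futures import ThreadPoolExecutor


def enrich_context(text: str) -> str:
    """Hypothetical stand-in for context enrichment; here it only normalizes whitespace."""
    return " ".join(text.split())


def sharpen_caption(caption: str) -> str:
    """Hypothetical stand-in for caption sharpening; capitalizes and adds a period."""
    cleaned = caption.strip().rstrip(".")
    return cleaned[:1].upper() + cleaned[1:] + "."


def optimize_inputs(text: str, caption: str) -> tuple[str, str]:
    """Run the two independent steps concurrently and return (context, caption)."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        context_future = pool.submit(enrich_context, text)
        caption_future = pool.submit(sharpen_caption, caption)
        return context_future.result(), caption_future.result()


if __name__ == "__main__":
    print(optimize_inputs("We  train\n a model", " overview of the method "))
```

Threads are a reasonable fit here because the two steps do not depend on each other and, in a real pipeline, would spend most of their time waiting on network responses.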

Caveats

  • This is explicitly unofficial and community-driven; the README disclaims any affiliation with Google Research or the original authors
  • The default model names (GPT-5.2, GPT-Image-1.5) look speculative: the README uses these identifiers, but they may not map to publicly available endpoints as labeled
  • Requires external API keys; no local-only mode is mentioned

Verdict Worth a spin if you’re churning out ML papers and would rather automate the figure grind than wrestle with TikZ. Skip it if you need pixel-perfect control or can’t stomach cloud API costs for iterative image generation.
