Is PaperBanana open source?

Yes — dwzhu-pku/PaperBanana is open source, released under the Apache-2.0 license.

What language is PaperBanana written in?

dwzhu-pku/PaperBanana is primarily written in Python.

How popular is PaperBanana?

dwzhu-pku/PaperBanana has 6.7k stars on GitHub.

Where can I find PaperBanana?

dwzhu-pku/PaperBanana is on GitHub at https://github.com/dwzhu-pku/PaperBanana.

dwzhu-pku/PaperBanana

Let a Committee of Agents Illustrate Your Research

PaperBanana is a multi-agent framework that automates academic illustration by having a Retriever, Planner, Stylist, Visualizer, and Critic collaboratively turn your method section into publication-ready figures.

★6.7k stars Python Creative · Design Domain Apps

What it does

PaperBanana is a reference-driven multi-agent framework that converts scientific text into publication-quality diagrams and plots. It orchestrates five specialized agents—Retriever, Planner, Stylist, Visualizer, and Critic—to ingest your method section and figure caption, retrieve relevant reference examples, plan the layout, enforce academic style guidelines, generate visuals, and iteratively refine them. The system can operate in several modes ranging from barebones direct generation to the full pipeline with all agents engaged.
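As a rough picture of the flow described above, the five roles can be read as functions chained in sequence. This is a minimal sketch under stated assumptions, not PaperBanana's actual code: every name in it (`FigureJob`, `retrieve`, `plan`, `apply_style`, `visualize`, `critique`, `run_pipeline`) is hypothetical, and each agent is stubbed with simple string handling where a real system would call a model.

```python
from dataclasses import dataclass, field


@dataclass
class FigureJob:
    """Hypothetical container for the two inputs named above."""
    method_text: str
    caption: str
    trace: list = field(default_factory=list)  # records which stage ran, in order


def retrieve(job, reference_pool):
    # Stub retriever: keep references sharing at least one word with the caption.
    caption_words = set(job.caption.lower().split())
    hits = [ref for ref in reference_pool if caption_words & set(ref.lower().split())]
    job.trace.append("retriever")
    return hits


def plan(job, examples):
    # Stub planner: describe a layout informed by the retrieved examples.
    job.trace.append("planner")
    return f"layout({job.caption!r}, refs={len(examples)})"


def apply_style(job, layout):
    # Stub stylist: wrap the layout with style guidance.
    job.trace.append("stylist")
    return f"styled({layout})"


def visualize(job, styled_plan):
    # Stub visualizer: a real one would return an image, not a string.
    job.trace.append("visualizer")
    return f"image[{styled_plan}]"


def critique(job, image):
    # Stub critic: accept anything that looks like a rendered image.
    job.trace.append("critic")
    return "accept" if image.startswith("image[") else "revise"


def run_pipeline(job, reference_pool):
    examples = retrieve(job, reference_pool)
    layout = plan(job, examples)
    styled = apply_style(job, layout)
    image = visualize(job, styled)
    return image, critique(job, image)


job = FigureJob(method_text="We encode, then decode.", caption="encoder decoder overview")
image, verdict = run_pipeline(job, ["transformer encoder diagram", "bar chart of accuracy"])
print(job.trace)
print(verdict)
```

A direct-generation mode would, in this picture, skip straight to `visualize`; how the real project wires its modes is not shown here.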

The interesting bit

The framework treats illustration as a team sport rather than a single prompt: the Critic agent forms a closed loop with the Visualizer for multi-round improvements, and the Stylist agent synthesizes aesthetic guidelines automatically rather than relying on hand-tuned prompts. It is also deliberately modular, letting you swap in different vision-language models and image generators via OpenRouter or Google Gemini.
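The Critic and Visualizer loop can be sketched as a bounded refine cycle: render, review, feed the critique back, and stop when the critic has no complaints or the round budget runs out. The code below is an illustrative sketch only. `render`, `review`, `refine`, and `max_rounds` are hypothetical names, and the "critic" is a checklist stand-in, not a vision-language model.

```python
def render(plan, feedback):
    # Stand-in visualizer: the "figure" is the plan plus every fix applied so far.
    return {"plan": plan, "fixes": list(feedback)}


def review(figure, required_fixes):
    # Stand-in critic: report the first required fix the figure lacks, or None.
    for fix in required_fixes:
        if fix not in figure["fixes"]:
            return fix
    return None


def refine(plan, required_fixes, max_rounds=3):
    """Alternate critic and visualizer until the critic is satisfied or rounds run out.

    Returns the final figure and the number of revision rounds actually used.
    """
    feedback = []
    figure = render(plan, feedback)
    for round_no in range(1, max_rounds + 1):
        issue = review(figure, required_fixes)
        if issue is None:
            return figure, round_no - 1
        feedback.append(issue)             # critique flows back to the visualizer
        figure = render(plan, feedback)    # the visualizer redraws with the new feedback
    return figure, max_rounds


figure, rounds = refine("encoder-decoder layout", ["label arrows", "align boxes"])
print(rounds, figure["fixes"])
```

Capping the number of rounds is a design choice in this sketch: it keeps cost predictable when the critic never becomes fully satisfied.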

Key highlights

Reference-driven generation that learns in-context from curated academic diagrams in the PaperBananaBench dataset, though it can fall back to zero-shot generation if the dataset is absent.
Parallel generation of up to 20 candidate diagrams at once, with batch export and a refinement tab for upscaling to 2K or 4K resolution.
Multiple pipeline modes—from vanilla direct generation to dev_full with all five agents—exposed through both a Gradio web UI and a command-line interface.
Explicitly community-oriented and non-commercial; the authors forked this from Google Research’s PaperVizAgent and welcome third-party forks, including a Chinese-enhanced community version.
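
The parallel-candidate highlight above can be pictured with a thread pool that fans out independent generation runs. This is a hedged sketch, not the project's implementation: `generate_candidate` and `generate_candidates` are hypothetical names, the per-candidate work is a placeholder string, and only the cap of 20 comes from the description above.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_CANDIDATES = 20  # upper bound taken from the highlight above


def generate_candidate(caption, seed):
    # Placeholder for one full generation run; a real run would call a model.
    return f"{caption} (candidate {seed})"


def generate_candidates(caption, n):
    # Clamp the request to the range 1..MAX_CANDIDATES.
    n = max(1, min(n, MAX_CANDIDATES))
    with ThreadPoolExecutor(max_workers=n) as pool:
        # pool.map preserves input order, so candidate i ends up at index i.
        return list(pool.map(lambda seed: generate_candidate(caption, seed), range(n)))


candidates = generate_candidates("encoder decoder overview", 50)
print(len(candidates))
```

Threads suit this sketch because each run would mostly wait on network calls to a model provider; that is an assumption about the workload, not a statement about how PaperBanana schedules its jobs.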

Caveats

Several headline features remain on the TODO list: the code for statistical plots, manual example selection, and style-guided improvement of existing diagrams has not yet been uploaded.
The reference set is currently narrow, with expansion beyond computer science still pending.
The authors themselves warn that there is “a long way to go for more reliable generation and for more diverse, complex scenarios,” so expectations should be tempered accordingly.

Verdict

Researchers who would rather iterate on hypotheses than on figure layouts should try the online demo first. If you need guaranteed pixel-perfect figures for tomorrow’s deadline, keep a human illustrator on speed dial—the agents are promising but still experimental.
