Your research paper, now with a Doraemon theme
An LLM pipeline that turns PDFs into slides or posters, complete with checkpoint recovery and natural-language styling.

What it does
Feed it a PDF, Word doc, or even a PowerPoint, and Paper2Slides runs a four-stage pipeline: RAG indexing, content extraction, layout planning, and final image generation. Output is a set of PNG slides or a poster, optionally merged into a PDF. There’s a web UI, but the CLI is fully scriptable.
The interesting bit
The checkpoint system is the quiet workhorse. Each stage writes a JSON checkpoint, so you can resume after a crash, switch styles without re-parsing, or regenerate images while keeping the same plan. The README also notes that “fine-grained layout instructions ground well; fine-grained element styling does not”, which is a useful, honest hint for prompt engineering.
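The resume behaviour described above can be sketched in a few lines of Python. This is a minimal illustration of per-stage JSON checkpointing, not Paper2Slides' actual code: the stage names follow the four stages listed earlier, while `run_pipeline`, the `checkpoint_<stage>.json` file names, and the `from_stage` parameter are all hypothetical.

```python
import json
from pathlib import Path

# Stage names mirror the four stages described above; the file layout and
# function names are assumptions made for this sketch.
STAGES = ["rag", "extract", "plan", "generate"]


def run_pipeline(stage_fns, workdir, from_stage=None):
    """Run stages in order, reusing JSON checkpoints already on disk.

    stage_fns maps a stage name to a function that takes the previous
    stage's result and returns a JSON-serializable value. Passing
    from_stage forces that stage and every later one to be recomputed.
    """
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    force_from = STAGES.index(from_stage) if from_stage else len(STAGES)

    result = None
    rerun = False
    for i, name in enumerate(STAGES):
        checkpoint = workdir / f"checkpoint_{name}.json"
        if i >= force_from or not checkpoint.exists():
            # Once one stage is recomputed, everything downstream is stale.
            rerun = True
        if rerun:
            result = stage_fns[name](result)
            checkpoint.write_text(json.dumps(result))
        else:
            result = json.loads(checkpoint.read_text())
    return result
```

Under these assumptions, a run that crashed during image generation would pick up from the saved plan on the next call, and `from_stage="plan"` would redo the layout and images while keeping the parsed content.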
Key highlights
- Supports PDF, Word, Excel, PowerPoint, and Markdown inputs
- Normal mode uses RAG for long documents; --fast skips indexing for quick previews
- Parallel generation (--parallel N) speeds up multi-slide output
- Custom styles via natural language prompts (the Totoro example is in the README)
- Image generation defaults to Gemini via OpenRouter, with a fallback to the direct Google API
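To see why a parallel option helps with multi-slide output, here is a rough sketch of fanning slide rendering out over a worker pool. It only illustrates the general idea: `render_slide` and `render_all` are hypothetical stand-ins, and nothing here is taken from the project's own implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def render_slide(spec):
    """Stand-in for a slow image-generation API call (hypothetical)."""
    return f"slide_{spec['index']:02d}.png"


def render_all(specs, parallel=1):
    """Render every slide spec with up to `parallel` concurrent workers.

    Executor.map yields results in input order, so slides stay in
    sequence even when later ones finish first.
    """
    with ThreadPoolExecutor(max_workers=max(1, parallel)) as pool:
        return list(pool.map(render_slide, specs))
```

Threads suit this kind of work because each slide mostly waits on a network response, so a handful of workers can overlap those waits.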
Caveats
- Requires external LLM and image-gen API keys; no local-only mode is mentioned
- Image generation model must support image responses, or you need to configure MIME types correctly
- The “fast” mode only works if the full document fits in the LLM context window
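The context-window caveat can be checked before committing to a run. The sketch below is one possible pre-flight test, assuming a crude four-characters-per-token estimate for English text; `estimate_tokens`, `choose_mode`, and the `reserved_for_output` margin are hypothetical helpers, not part of the tool.

```python
def estimate_tokens(text, chars_per_token=4.0):
    """Rough token count; ~4 characters per token is only a heuristic."""
    return int(len(text) / chars_per_token) + 1


def choose_mode(text, context_window, reserved_for_output=4096):
    """Return "fast" if the whole document should fit in one prompt, else "normal"."""
    # Leave room for the model's reply (assumed margin, tune per model).
    budget = context_window - reserved_for_output
    return "fast" if estimate_tokens(text) <= budget else "normal"
```

A real check would use the target model's own tokenizer, since the heuristic can be well off for non-English text or documents heavy in formulas and tables.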
Verdict
Worth a look if you regularly turn papers into conference posters or deck slides and want to automate the busywork. Skip it if you need pixel-perfect manual control or can’t route documents through third-party LLM APIs.