
Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers

Wei Pang, Kevin Qinghong Lin, Xiangru Jian, Xi He, Philip Torr
Release Date: 5/28/2025
Abstract

Academic poster generation is a crucial yet challenging task in scientific communication, requiring the compression of long-context interleaved documents into a single, visually coherent page. To address this challenge, we introduce the first benchmark and metric suite for poster generation, which pairs recent conference papers with author-designed posters and evaluates outputs on (i) Visual Quality: semantic alignment with human posters; (ii) Textual Coherence: language fluency; (iii) Holistic Assessment: six fine-grained aesthetic and informational criteria scored by a VLM-as-judge; and, notably, (iv) PaperQuiz: the poster's ability to convey core paper content, as measured by VLMs answering generated quizzes. Building on this benchmark, we propose PosterAgent, a top-down, visual-in-the-loop multi-agent pipeline: (a) the Parser distills the paper into a structured asset library; (b) the Planner aligns text-visual pairs into a binary-tree layout that preserves reading order and spatial balance; and (c) the Painter-Commenter loop refines each panel by executing rendering code and using VLM feedback to eliminate overflow and ensure alignment. In our comprehensive evaluation, we find that GPT-4o outputs, though visually appealing at first glance, often exhibit noisy text and poor PaperQuiz scores, and that reader engagement is the primary aesthetic bottleneck, as human-designed posters rely largely on visual semantics to convey meaning. Our fully open-source variants (e.g., based on the Qwen-2.5 series) outperform existing 4o-driven multi-agent systems across nearly all metrics while using 87% fewer tokens; PosterAgent transforms a 22-page paper into a finalized yet editable .pptx poster for just $0.005. These findings chart clear directions for the next generation of fully automated poster-generation models. The code and datasets are available at https://github.com/Paper2Poster/Paper2Poster.
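To make the Planner's binary-tree layout concrete, the sketch below shows one simple way such a layout could work: panels are recursively split into two groups, and the page region is cut along its longer axis in proportion to each group's content weight, preserving reading order and rough spatial balance. This is an illustrative assumption, not the paper's actual implementation; the `Region`, `split_layout`, panel names, and weights are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Region:
    x: float
    y: float
    w: float
    h: float

def split_layout(region, panels):
    """Recursively assign len(panels) sub-regions via binary splits.

    Hypothetical illustration: the panel list is halved at each step,
    the region is cut along its longer axis, and the cut position is
    proportional to each half's share of the total content weight.
    """
    if len(panels) == 1:
        return [(panels[0][0], region)]
    mid = len(panels) // 2
    left, right = panels[:mid], panels[mid:]
    frac = sum(w for _, w in left) / sum(w for _, w in panels)
    if region.w >= region.h:  # wide region: vertical cut (left/right)
        a = Region(region.x, region.y, region.w * frac, region.h)
        b = Region(region.x + region.w * frac, region.y,
                   region.w * (1 - frac), region.h)
    else:  # tall region: horizontal cut (top/bottom)
        a = Region(region.x, region.y, region.w, region.h * frac)
        b = Region(region.x, region.y + region.h * frac,
                   region.w, region.h * (1 - frac))
    return split_layout(a, left) + split_layout(b, right)

# Example: a 48x36 poster with four panels weighted by content size.
page = Region(0, 0, 48, 36)
panels = [("Intro", 1.0), ("Method", 2.0), ("Results", 2.0), ("Conclusion", 1.0)]
layout = split_layout(page, panels)
for name, r in layout:
    print(name, r.x, r.y, r.w, r.h)
```

Because panels are halved in list order before any geometric cut is chosen, the left-to-right, top-to-bottom reading order of the panel list survives in the final arrangement; the weight-proportional cut position is what provides the spatial balance the abstract refers to.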