Paper2Video: Automatic Video Generation from Scientific Papers
Zeyu Zhu, Kevin Qinghong Lin, Mike Zheng Shou

Abstract
Academic presentation videos have become an essential medium for research communication, yet producing them remains highly labor-intensive, often requiring hours of slide design, recording, and editing for a video of only 2 to 10 minutes. Unlike natural video, presentation video generation poses distinctive challenges: inputs drawn from research papers, dense multi-modal information (text, figures, tables), and the need to coordinate multiple aligned channels such as slides, subtitles, speech, and the human talker. To address these challenges, we introduce Paper2Video, the first benchmark of 101 research papers paired with author-created presentation videos, slides, and speaker metadata. We further design four tailored evaluation metrics (Meta Similarity, PresentArena, PresentQuiz, and IP Memory) to measure how well a video conveys the paper's information to the audience. Building on this foundation, we propose PaperTalker, the first multi-agent framework for academic presentation video generation. It integrates slide generation with effective layout refinement driven by a novel, efficient tree-search visual choice, together with cursor grounding, subtitling, speech synthesis, and talking-head rendering, while parallelizing slide-wise generation for efficiency. Experiments on Paper2Video demonstrate that the presentation videos produced by our approach are more faithful and informative than those of existing baselines, establishing a practical step toward automated, ready-to-use academic video generation. Our dataset, agent, and code are available at https://github.com/showlab/Paper2Video.
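The abstract notes that per-slide work (subtitling, speech synthesis, cursor grounding, talking-head rendering) depends only on the slide it belongs to, which is what makes slide-wise parallelization possible. Below is a minimal sketch of that idea under stated assumptions: every function name and each stubbed step is a hypothetical stand-in for illustration, not the PaperTalker API.

```python
# A minimal sketch of slide-wise parallel generation. Each slide's
# downstream assets depend only on that slide, so per-slide pipelines
# can run concurrently and be concatenated in order afterwards.
# All names and stubbed steps below are hypothetical illustrations.
from concurrent.futures import ThreadPoolExecutor

def make_subtitle(slide: dict) -> str:
    # Stand-in for subtitle generation from the slide's content.
    return f"Narration for: {slide['title']}"

def synthesize_speech(subtitle: str) -> bytes:
    # Stand-in for a text-to-speech call; returns fake audio bytes.
    return subtitle.encode("utf-8")

def ground_cursor(slide: dict, subtitle: str) -> list[tuple[int, int]]:
    # Stand-in for cursor grounding: where the cursor should point
    # on the slide while each part of the subtitle is spoken.
    return [(0, 0)]

def render_talking_head(audio: bytes) -> str:
    # Stand-in for talking-head rendering driven by the speech track.
    return f"talker_segment_{len(audio)}.mp4"

def generate_slide_assets(slide: dict) -> dict:
    # Per-slide pipeline: subtitling -> speech -> cursor -> talker.
    subtitle = make_subtitle(slide)
    audio = synthesize_speech(subtitle)
    return {
        "slide": slide,
        "subtitle": subtitle,
        "cursor": ground_cursor(slide, subtitle),
        "talker": render_talking_head(audio),
    }

def build_presentation(slides: list[dict]) -> list[dict]:
    # Slides are independent, so map the per-slide pipeline in
    # parallel; final assembly concatenates the segments in order.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(generate_slide_assets, slides))

if __name__ == "__main__":
    segments = build_presentation([{"title": "Motivation"},
                                   {"title": "Method"}])
    print([s["talker"] for s in segments])
```

The design point this sketch captures is that only the cross-slide planning step is sequential; once the slide plans exist, everything downstream fans out per slide, which is where the efficiency gain the abstract mentions would come from.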