
MedGen: Unlocking Medical Video Generation by Scaling Granularly-annotated Medical Videos

Rongsheng Wang, Junying Chen, Ke Ji, Zhenyang Cai, Shunian Chen, Yunjin Yang, Benyou Wang
Abstract

Recent advances in video generation have shown remarkable progress in open-domain settings, yet medical video generation remains largely underexplored. Medical videos are critical for applications such as clinical training, education, and simulation, requiring not only high visual fidelity but also strict medical accuracy. However, current models often produce unrealistic or erroneous content when applied to medical prompts, largely due to the lack of large-scale, high-quality datasets tailored to the medical domain. To address this gap, we introduce MedVideoCap-55K, the first large-scale, diverse, and caption-rich dataset for medical video generation. It comprises over 55,000 curated clips spanning real-world medical scenarios, providing a strong foundation for training generalist medical video generation models. Built upon this dataset, we develop MedGen, which achieves leading performance among open-source models and rivals commercial systems across multiple benchmarks in both visual quality and medical accuracy. We hope our dataset and model can serve as a valuable resource and help catalyze further research in medical video generation. Our code and data are available at https://github.com/FreedomIntelligence/MedGen