HyperAI

PubMedVision Large-Scale Medical VQA Dataset

Date

10 months ago

Size

53.54 GB

Organization

Publish URL

github.com

* This dataset supports online use.Click here to jump.

PubMedVision is a large-scale and high-quality medical multimodal dataset created in 2024 by a research team from Shenzhen Big Data Research Institute, the Chinese University of Hong Kong, and the National Health Data Institute. It contains 1.3 million medical VQA samples.HuatuoGPT-Vision, Towards Injecting Medical Visual Knowledge into Multimodal LLMs at Scale".

This dataset uses sophisticated data processing methods to select medical-related images and informative image descriptions from papers in the international medical journal PubMed, effectively filtering out a large number of medical-irrelevant images and context-irrelevant content. In order to improve the alignment of image and text data, the research team used the large visual model (GPT-4V) to re-describe the images and constructed 10 scene dialogues, rewriting the image and text data into a question-and-answer format, which enhanced the learning of medical visual knowledge.

PubMedVision.torrent
Seeding 1Downloading 0Completed 111Total Downloads 477
  • PubMedVision/
    • README.md
      1.46 KB
    • README.txt
      2.93 KB
      • data/
        • PubMedVision.zip
          53.54 GB