OmniMedVQA Large-Scale Medical VQA Evaluation Dataset

OmniMedVQA is a large-scale visual question answering (VQA) evaluation dataset focusing on the medical field. This dataset was jointly launched by the University of Hong Kong and the Shanghai Artificial Intelligence Laboratory in 2024 to provide an evaluation benchmark for the development of large multimodal models in medicine.
Features of the OmniMedVQA dataset include:
- Large scale and diversity: The dataset contains 118,010 images, covering 12 imaging modalities and more than 20 organs and anatomical regions of the human body.
- Real medical scenarios: All images come from real medical scenes, so the benchmark stays consistent with the needs of the medical field and is suitable for evaluating Large Vision-Language Models (LVLMs).
- Multimodal tasks: The dataset is designed to evaluate the performance of LVLMs in multimodal tasks, especially in medical visual question answering.
- Comprehensive: OmniMedVQA is a comprehensive evaluation benchmark that not only includes medical images of multiple modalities, but also covers a wide range of anatomical regions, making it suitable for evaluating the potential and performance of LVLMs in the medical field.
- Publicly available: The dataset will be publicly available to the research community to promote research and model development in the field of medical visual question answering.
OmniMedVQA was created to address the deficiencies in diversity and realism of existing medical image datasets and to advance the development and evaluation of medical AI by providing rich image and question pairs based on real medical scenarios.
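Because the benchmark spans several imaging modalities, results are usually easier to interpret when broken down per modality rather than reported as a single number. The following is a minimal sketch of such a breakdown, not the official evaluation script. The record fields (`question_id`, `modality`, `answer`), the toy data, and the exact-match scoring rule are all assumptions made for illustration; the actual file layout and scoring protocol should be taken from the dataset release.

```python
from collections import defaultdict


def accuracy_by_modality(records, predictions):
    """Return {modality: fraction of answers that match exactly}.

    `records` is a list of dicts with hypothetical keys
    "question_id", "modality" and "answer"; `predictions` maps
    question_id -> the model's answer string.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        modality = rec["modality"]
        total[modality] += 1
        # A missing prediction counts as wrong.
        pred = predictions.get(rec["question_id"], "")
        # Case-insensitive, whitespace-trimmed exact match (an assumption).
        if pred.strip().lower() == rec["answer"].strip().lower():
            correct[modality] += 1
    return {m: correct[m] / total[m] for m in total}


# Toy records, invented for illustration only (not taken from the dataset).
records = [
    {"question_id": "q1", "modality": "CT", "answer": "Lung"},
    {"question_id": "q2", "modality": "CT", "answer": "Liver"},
    {"question_id": "q3", "modality": "MRI", "answer": "Brain"},
]
predictions = {"q1": "lung", "q2": "Kidney", "q3": "Brain "}

scores = accuracy_by_modality(records, predictions)
print(scores)
```

Exact matching is the simplest possible rule; free-form model output typically needs answer extraction or normalization before a comparison like this is meaningful.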
Disclaimer
OmniMedVQA is built on multiple public datasets. It draws on the community's work and aims to give back to the community by providing researchers and developers with a resource for academic and technical research. Any individual or organization using this dataset (hereinafter collectively referred to as "user") must comply with the following disclaimer:
- Dataset source: This dataset is built from multiple public datasets, whose sources are clearly indicated in the paper. Users should comply with the relevant licenses and terms of use of the original datasets.
- Data accuracy: We have tried to ensure the accuracy and completeness of the dataset, but users bear any risks and responsibilities that may arise from using it.
- Limitation of liability: In no event shall the provider of the dataset and related contributors be liable for any actions or results of the user.
- Usage restrictions: Users should comply with applicable laws, regulations and ethical standards when using this dataset. Users may not use this dataset for privacy-invading, defamatory, discriminatory or otherwise illegal or unethical purposes.
- Intellectual property: The intellectual property rights of all image data in this dataset belong to the relevant rights holders of the original datasets. Users shall not infringe those rights in any way.
- As a non-profit organization, the team advocates a harmonious and friendly open source communication environment. If you find any content in the open source dataset that infringes your legal rights, please contact us and we will do our best to assist you.
- By downloading, copying, accessing or using this dataset, the user indicates that they have read, understood and agreed to abide by all the terms and conditions in this disclaimer. If any part of this disclaimer is unacceptable to the user, please do not use this dataset.