HyperAI

SPIQA Multimodal Scientific Paper Question Answering Dataset

Date

9 months ago

Size

1.28 GB

Organization

Google Research
Johns Hopkins University

Publish URL

huggingface.co

This dataset was released in 2024 by a research team from Google Research and Johns Hopkins University. The associated paper is “SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers”.

Background

Finding answers to questions in long scientific research articles is an important research area, one that can help readers quickly resolve their queries. However, existing question answering (QA) datasets based on scientific papers are limited in scale and focus only on textual content. To address this limitation, the research team introduced SPIQA (Scientific Paper Image Question Answering).

Dataset Overview

SPIQA is the first large-scale QA dataset specifically designed for interpreting complex figures and tables in scientific research articles across various fields of computer science, leveraging the figure-understanding capabilities of multimodal large language models (MLLMs). The research team designed an information-seeking task involving multiple images, covering a variety of plots, charts, tables, diagrams, and result visualizations, and built the dataset through a combination of automatic and manual curation. SPIQA contains 270K questions, divided into training, validation, and three distinct evaluation splits. Through extensive experiments with 12 prominent foundation models, the team evaluated how well current multimodal systems understand the nuanced aspects of research articles.
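For readers who prefer pulling the files from the Hugging Face Hub rather than the torrent below, a minimal download sketch follows. The repo id "google/spiqa" is an assumption inferred from the publishing organization and is not confirmed by this page; verify it on huggingface.co before running.

    from huggingface_hub import snapshot_download

    # Download every file in the dataset repo to a local cache directory.
    # NOTE: the repo id below is an assumption, not confirmed by this page.
    local_dir = snapshot_download(
        repo_id="google/spiqa",
        repo_type="dataset",
    )
    print(f"Dataset files downloaded to: {local_dir}")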

SPIQA.torrent
Seeding 1 · Downloading 1 · Completed 72 · Total Downloads 75
  • SPIQA/
    • README.md
      1.95 KB
    • README.txt
      3.89 KB
    • data/
      • spiqa.zip
        1.28 GB
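
After downloading the torrent, the archive can be unpacked with the Python standard library. The sketch below assumes only the file layout shown above; the internal structure of spiqa.zip (split files, image folders) is not documented on this page, so the script simply lists the top-level entries for inspection.

    import zipfile
    from pathlib import Path

    # Paths follow the torrent layout shown above.
    archive = Path("SPIQA/data/spiqa.zip")
    out_dir = Path("SPIQA/data/extracted")
    out_dir.mkdir(parents=True, exist_ok=True)

    # Extract the full archive (about 1.28 GB compressed).
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(out_dir)

    # Inspect the top-level entries before writing any loading code,
    # since the internal layout is not documented here.
    for entry in sorted(out_dir.iterdir()):
        print(entry.name)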