PlotQA: Reasoning over Scientific Plots

Methani, Nitesh; Ganguly, Pritha; Khapra, Mitesh M.; Kumar, Pratyush
Abstract

Existing synthetic datasets (FigureQA, DVQA) for reasoning over plots do not contain variability in data labels, real-valued data, or complex reasoning questions. Consequently, proposed models for these datasets do not fully address the challenge of reasoning over plots. In particular, they assume that the answer comes either from a small fixed size vocabulary or from a bounding box within the image. However, in practice, this is an unrealistic assumption because many questions require reasoning and thus have real-valued answers which appear neither in a small fixed size vocabulary nor in the image. In this work, we aim to bridge this gap between existing datasets and real-world plots. Specifically, we propose PlotQA with 28.9 million question-answer pairs over 224,377 plots on data from real-world sources and questions based on crowd-sourced question templates. Further, 80.76% of the out-of-vocabulary (OOV) questions in PlotQA have answers that are not in a fixed vocabulary. Analysis of existing models on PlotQA reveals that they cannot deal with OOV questions: their overall accuracy on our dataset is in single digits. This is not surprising given that these models were not designed for such questions. As a step towards a more holistic model which can address fixed vocabulary as well as OOV questions, we propose a hybrid approach: Specific questions are answered by choosing the answer from a fixed vocabulary or by extracting it from a predicted bounding box in the plot, while other questions are answered with a table question-answering engine which is fed with a structured table generated by detecting visual elements from the image. On the existing DVQA dataset, our model has an accuracy of 58%, significantly improving on the highest reported accuracy of 46%. On PlotQA, our model has an accuracy of 22.52%, which is significantly better than state of the art models.
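
The hybrid approach described in the abstract can be pictured as a dispatch step: a router decides whether a question goes to a fixed-vocabulary answerer or to a table question-answering engine that reads a table recovered from the plot. The following is only a minimal sketch of that control flow under stated assumptions. Every name in it (is_fixed_vocab_question, toy_vocab_model, toy_table_qa, hybrid_answer) is hypothetical, the keyword-based router and both answerers are toy placeholders for learned models, and the table is written by hand rather than detected from an image.

```python
from typing import Callable, Dict, List, Union

Row = Dict[str, Union[str, float]]
Answer = Union[str, float]


def is_fixed_vocab_question(question: str) -> bool:
    """Hypothetical router: send yes/no-style questions to the fixed-vocabulary branch."""
    return question.lower().startswith(("is ", "are ", "does ", "do "))


def toy_vocab_model(question: str) -> Answer:
    """Placeholder for a classifier over a small answer vocabulary."""
    return "Yes"


def toy_table_qa(question: str, table: List[Row]) -> Answer:
    """Placeholder for a table QA engine: it only averages the 'value' column."""
    values = [float(row["value"]) for row in table]
    return sum(values) / len(values)


def hybrid_answer(
    question: str,
    table: List[Row],
    router: Callable[[str], bool] = is_fixed_vocab_question,
    vocab_model: Callable[[str], Answer] = toy_vocab_model,
    table_qa: Callable[[str, List[Row]], Answer] = toy_table_qa,
) -> Answer:
    """Route the question to one of the two branches and return its answer."""
    if router(question):
        return vocab_model(question)
    return table_qa(question, table)


# Hand-written stand-in for a table extracted from a plot (an assumption of this sketch).
table: List[Row] = [
    {"year": "2010", "value": 10.0},
    {"year": "2011", "value": 20.0},
]

print(hybrid_answer("Is the 2011 value greater than the 2010 value?", table))
print(hybrid_answer("What is the average value across all years?", table))
```

Passing the router and both answerers as parameters keeps the dispatch logic separate from the models, so a real-valued answer that is neither in the vocabulary nor printed in the plot can still be produced by the table branch.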