HotpotQA Question Answering Dataset
Date: 3 years ago
Size: 673.69 MB
Publish URL:
License: CC BY-SA 4.0

HotpotQA is a large-scale question-answering dataset collected from English Wikipedia, containing about 113,000 crowdsourced questions. Answering each question requires the introductory paragraphs of two Wikipedia articles. Each question comes with these two gold paragraphs and a list of sentences from them, marked as the supporting facts necessary to answer the question.
The dataset has the following characteristics:
- Questions require looking up and reasoning across multiple supporting documents to answer;
- The questions are diverse and not constrained by any pre-existing knowledge base or knowledge schema;
- The dataset provides the sentence-level supporting facts required for reasoning, allowing QA systems to reason under strong supervision and explain their predictions;
- The dataset offers a new type of factoid comparison question to test the ability of QA systems to extract relevant facts and make the necessary comparisons.
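The sentence-level supporting facts described above can be illustrated with a minimal sketch. It assumes the record layout of the publicly distributed HotpotQA JSON files, where `context` is a list of `[title, sentences]` pairs and `supporting_facts` is a list of `[title, sentence_index]` pairs. The record below is invented for illustration and is not taken from the dataset, and `supporting_sentences` is a hypothetical helper name.

```python
from typing import Dict, List, Tuple

# Toy record, invented for illustration. The field names are an assumption
# based on the layout of the distributed HotpotQA JSON files.
example: Dict = {
    "question": "Which city is older, A-town or B-ville?",
    "answer": "A-town",
    "type": "comparison",
    "context": [
        ["A-town", ["A-town is a city.", " It was founded in 1200."]],
        ["B-ville", ["B-ville is a city.", " It was founded in 1500."]],
    ],
    "supporting_facts": [["A-town", 1], ["B-ville", 1]],
}


def supporting_sentences(record: Dict) -> List[Tuple[str, str]]:
    """Resolve each (title, sentence index) pair to the sentence text."""
    # Index the paragraphs by article title for direct lookup.
    paragraphs = {title: sentences for title, sentences in record["context"]}
    resolved = []
    for title, sent_idx in record["supporting_facts"]:
        sentences = paragraphs.get(title, [])
        # Skip indices that fall outside the paragraph instead of failing.
        if 0 <= sent_idx < len(sentences):
            resolved.append((title, sentences[sent_idx].strip()))
    return resolved


for title, sentence in supporting_sentences(example):
    print(f"{title}: {sentence}")
```

In this toy comparison question, both supporting sentences are needed: neither founding date alone determines which city is older.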