bAbI Question Answering and Text Understanding Dataset
Date
6 years ago
The QA bAbI tasks are a dataset for question answering and text comprehension from the bAbI project. The dataset comprises a first set of 20 tasks for testing text understanding and reasoning. Each task has 1,000 questions for training and 1,000 questions for testing.
Each task is built from a set of contexts, and several question-answer pairs are posed against each context. The dataset currently contains the following directories:
- en/ – the tasks in English, human readable;
- hn/ – the tasks in Hindi, human readable;
- shuffled/ – the same tasks with the letters shuffled, so that they are not readable by humans and existing parsers and taggers cannot be applied directly, which forces the learner to rely more heavily on the given training data;
- en-10k/, shuffled-10k/ and hn-10k/ – the same tasks in the three formats above, with 10,000 training examples each.
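The context-plus-questions structure described above can be illustrated with a small parsing sketch in Python. It assumes the plain-text layout used by the released task files, which this page does not spell out: every line starts with a number, the number restarts at 1 when a new context begins, and a question line holds the question, the answer and the supporting line numbers separated by tabs. The function name `parse_babi` and the sample story are illustrative only, not part of the dataset.

```python
from typing import List, Tuple

Example = Tuple[List[str], str, str]  # (context sentences, question, answer)


def parse_babi(lines: List[str]) -> List[Example]:
    """Turn numbered bAbI-style lines into (context, question, answer) triples.

    Assumed layout (not stated on this page): "<n> <sentence>" for context
    lines and "<n> <question>\t<answer>\t<supporting ids>" for question lines.
    """
    examples: List[Example] = []
    context: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        number, text = line.split(" ", 1)
        if int(number) == 1:
            # Line numbering restarts at 1 whenever a new context begins.
            context = []
        if "\t" in text:
            # Question line: keep the question and answer, ignore supporting ids.
            parts = text.split("\t")
            question, answer = parts[0].strip(), parts[1].strip()
            # Copy the context so later sentences do not alter this example.
            examples.append((list(context), question, answer))
        else:
            context.append(text)
    return examples


# Illustrative sample in the assumed layout (two contexts, three questions).
sample = [
    "1 Mary moved to the bathroom.",
    "2 John went to the hallway.",
    "3 Where is Mary? \tbathroom\t1",
    "4 Daniel went back to the hallway.",
    "5 Where is Daniel? \thallway\t4",
    "1 Sandra went to the kitchen.",
    "2 Where is Sandra? \tkitchen\t1",
]

for story, question, answer in parse_babi(sample):
    print(len(story), question, answer)
```

Each question is paired with every context sentence seen so far in the same story, which reflects how several question-answer pairs share one context. To apply the sketch to a real file, pass the result of reading the file line by line; the file names inside each directory are not listed here, so check the downloaded archive.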
The QA bAbI tasks dataset was released in 2015 by Jason Weston, Antoine Bordes and colleagues at Facebook AI Research. The related paper is "Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks".