Quora Duplicate Questions Text Classification Research Dataset
Date
2 years ago
Size
55.48 MB
The Quora Duplicate Questions Dataset is a collection of question pairs labeled according to whether the two questions are duplicates of each other. It is used for text classification research and aims to give anyone the opportunity to train and test models of semantic equivalence.
The dataset consists of over 400,000 rows of potential duplicate question pairs. Each row contains the IDs and full text of the questions in the pair, along with a binary value indicating whether the pair is a duplicate.
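As a rough illustration of the row layout described above, the following Python sketch reads question pairs and computes the share labeled as duplicates. The column names (`id`, `qid1`, `qid2`, `question1`, `question2`, `is_duplicate`), the tab delimiter, and the helper names `read_pairs` and `duplicate_rate` are assumptions made for this example and are not confirmed by this listing; the sample rows are invented and do not come from the dataset.

```python
import csv
import io


def read_pairs(handle, delimiter="\t"):
    """Yield (question1, question2, is_duplicate) tuples from a delimited file.

    Assumes a header row with the columns question1, question2 and
    is_duplicate (hypothetical names; check the actual file header).
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    for row in reader:
        # The duplicate flag is assumed to be stored as the text "1" or "0".
        yield row["question1"], row["question2"], row["is_duplicate"] == "1"


def duplicate_rate(pairs):
    """Return the fraction of pairs flagged as duplicates (0.0 if empty)."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    return sum(1 for _, _, is_dup in pairs if is_dup) / len(pairs)


# Invented two-row sample in the assumed layout, used instead of the real file.
SAMPLE = (
    "id\tqid1\tqid2\tquestion1\tquestion2\tis_duplicate\n"
    "0\t1\t2\tHow do I learn Python?\tWhat is the best way to learn Python?\t1\n"
    "1\t3\t4\tWhat is a torrent?\tHow tall is Mount Everest?\t0\n"
)

sample_pairs = list(read_pairs(io.StringIO(SAMPLE)))
print(duplicate_rate(sample_pairs))
```

To run this against the downloaded data, an opened file handle can be passed to `read_pairs` in place of the `io.StringIO` sample, adjusting the delimiter and column names to match the actual file.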
This dataset was released by the Quora team in 2017, with the main publishers being Shankar Iyer, Nikhil Dandekar, and Kornél Csernai.
quora_duplicate_questions.torrent
Seeding: 2, Downloading: 0, Completed: 680, Total Downloads: 1,330