Quora Duplicate Questions Text Classification Research Dataset

Date: 3 years ago
Size: 55.48 MB
Organization: Quora
Publish URL: data.quora.com

The Quora Duplicate Questions Dataset is a dataset for determining whether pairs of question texts are duplicates, that is, whether the two questions ask the same thing. It is used in text classification research and is intended to give anyone the opportunity to train and test models of semantic equivalence.

The dataset consists of over 400,000 rows, each a potential duplicate question pair. Each row contains the IDs of the two questions, the full text of each question, and a binary value indicating whether the pair is a duplicate.
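
As a rough illustration of the row layout described above, the following sketch parses tab-separated rows into question pairs and computes the share labeled as duplicates. The column names (`qid1`, `qid2`, `question1`, `question2`, `is_duplicate`) are assumptions not stated on this page and should be checked against the README, and the two sample rows are invented for the example rather than taken from the dataset.

```python
import csv
import io
from typing import Iterable, Iterator, NamedTuple, TextIO


class QuestionPair(NamedTuple):
    qid1: int
    qid2: int
    question1: str
    question2: str
    is_duplicate: bool


def read_pairs(handle: TextIO) -> Iterator[QuestionPair]:
    """Yield one QuestionPair per TSV row (column names are assumed)."""
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        yield QuestionPair(
            qid1=int(row["qid1"]),
            qid2=int(row["qid2"]),
            question1=row["question1"],
            question2=row["question2"],
            # The binary label is stored as the text "0" or "1".
            is_duplicate=row["is_duplicate"] == "1",
        )


def duplicate_rate(pairs: Iterable[QuestionPair]) -> float:
    """Return the fraction of pairs labeled as duplicates (0.0 if empty)."""
    items = list(pairs)
    if not items:
        return 0.0
    return sum(p.is_duplicate for p in items) / len(items)


# Invented sample rows, used only to show the assumed layout.
SAMPLE = (
    "id\tqid1\tqid2\tquestion1\tquestion2\tis_duplicate\n"
    "0\t1\t2\tHow do I learn Python?\tWhat is the best way to learn Python?\t1\n"
    "1\t3\t4\tWhat is a torrent?\tHow tall is Mount Everest?\t0\n"
)

pairs = list(read_pairs(io.StringIO(SAMPLE)))
print(f"{len(pairs)} pairs, duplicate rate {duplicate_rate(pairs):.2f}")
```

To run this against the real file, the same function could be given a handle opened with `open("quora_duplicate_questions/data/quora_duplicate_questions.tsv", newline="", encoding="utf-8")`, assuming the torrent is unpacked with the directory layout listed on this page.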

The dataset was released by the Quora team in 2017. The release is credited mainly to Shankar Iyer, Nikhil Dandekar, and Kornél Csernai.

quora_duplicate_questions.torrent
Seeding: 3 | Downloading: 0 | Completed: 800 | Total downloads: 1,465
  • quora_duplicate_questions/
    • README.md
      1.15 KB
    • README.txt
      2.29 KB
    • data/
      • quora_duplicate_questions.tsv
        55.48 MB
