Date

3 years ago

Size

55.48 MB

Organization

Publish URL

data.quora.com

Tags

Natural Language Processing

Deep Learning

Text Generation

The Quora Duplicate Questions Dataset is a dataset for determining whether two questions are duplicates of each other, that is, whether they ask the same thing. It is used for text classification research and is intended to give anyone the opportunity to train and test models of semantic equivalence. The dataset consists of over 400,000 rows of potential duplicate question pairs; each row contains the IDs of the two questions, the full text of each question, and a binary value indicating whether the pair is a duplicate. The dataset was released by the Quora team in 2017; its main publishers are Shankar Iyer, Nikhil Dandekar, and Kornél Csernai.
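The row layout described above can be sketched as a small reader. This is a minimal illustration, not an official loader: it assumes a tab-separated file with a header row and the column names `id`, `qid1`, `qid2`, `question1`, `question2`, and `is_duplicate`, none of which are stated on this page. The `QuestionPair` type, the helper functions, and the two sample rows are made up for the example.

```python
import csv
import io
from typing import Iterable, Iterator, NamedTuple, TextIO


class QuestionPair(NamedTuple):
    """One row: two question IDs, two question texts, and a binary label."""
    qid1: str
    qid2: str
    question1: str
    question2: str
    is_duplicate: bool


def read_pairs(handle: TextIO) -> Iterator[QuestionPair]:
    """Yield one QuestionPair per data row of a tab-separated file with a header.

    The column names used here are assumptions about the file layout.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        yield QuestionPair(
            qid1=row["qid1"],
            qid2=row["qid2"],
            question1=row["question1"],
            question2=row["question2"],
            is_duplicate=row["is_duplicate"] == "1",
        )


def duplicate_rate(pairs: Iterable[QuestionPair]) -> float:
    """Return the fraction of pairs labeled as duplicates (0.0 if empty)."""
    items = list(pairs)
    if not items:
        return 0.0
    return sum(p.is_duplicate for p in items) / len(items)


# Two made-up rows in the assumed layout (not real dataset rows).
SAMPLE = (
    "id\tqid1\tqid2\tquestion1\tquestion2\tis_duplicate\n"
    "0\t1\t2\tHow do I learn Python?\tWhat is the best way to learn Python?\t1\n"
    "1\t3\t4\tWhat is a torrent file?\tHow tall is Mount Everest?\t0\n"
)

pairs = list(read_pairs(io.StringIO(SAMPLE)))
print(len(pairs), duplicate_rate(pairs))
```

To read a downloaded file instead of the in-memory sample, pass a handle opened with `open(path, newline="", encoding="utf-8")`, where `path` is wherever the data file was saved. Opening with `newline=""` lets the `csv` module handle quoted fields that contain line breaks.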

quora_duplicate_questions.torrent

Seeding: 3, Downloading: 0, Completed: 844, Total Downloads: 1,525

quora_duplicate_questions/
- README.md
  1.15 KB
- README.txt
  2.29 KB

This dataset is contributed by community users and is intended for educational and informational purposes only. If any content involves copyright infringement, please contact us at [email protected] for prompt review and removal.

Related Datasets

COCO-2017-Vietnamese Vietnamese Image Detection Dataset

2 months ago

Nemotron Personas France (French Synthetic Personas Dataset)

3 months ago

Groundsource Global Flood Events Dataset

4 months ago

Nemotron-Personas-Brazil Brazilian Synthetic Character Dataset

9 days ago

RoVid-X Robot Video Generation Dataset

9 days ago

DeepPlanning Long-Term Planning Capability Assessment Dataset

5 months ago

Patient Segmentation Dataset

5 months ago

TransPhy3D Transparent Reflection Synthesis Video Dataset

5 months ago

Human Face Emotions Dataset

3 months ago

TxT360-3efforts Multi-Task Inference Dataset

9 days ago

MCD-rPPG Multi-Camera Remote Photoplethysmography Dataset

6 months ago

