Date

2 years ago

Size

481.53 MB

Organization

Publish URL

brightbenchmark.github.io

Paper URL

arxiv.org

* This dataset supports online use. Click here to jump.

BRIGHT is a text retrieval benchmark released in 2024 by the University of Hong Kong, Princeton University, the University of Washington, and Google Cloud AI Research. It was introduced in the paper "BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval" and is described as the first text retrieval benchmark that requires deep reasoning to retrieve relevant documents.

The research team collected 1,385 real queries from several sources (StackExchange, LeetCode, and math competitions), all drawn from real, human-written data. The team paired these queries with web pages linked in StackExchange answers and with theorems tagged in Mathematical Olympiad problems.

The benchmark is designed to evaluate and challenge retrieval systems on complex queries, where identifying relevant documents requires reasoning in addition to keyword matching. Simply put, BRIGHT tests whether a retrieval system can "understand" the logic and context behind a query, not just its surface text. For example, an economist may want to find documents about "how human activities affect the climate system." Answering this query requires understanding the relationship between human activities (such as deforestation and urbanization) and climate change, which keyword matching alone does not capture.
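To make the keyword-matching limitation concrete, here is a minimal sketch of a naive keyword-overlap scorer applied to the example query above. The two documents and the `keyword_overlap` function are hypothetical, written only for this illustration; they are not taken from BRIGHT and do not reflect how the benchmark scores retrieval systems.

```python
import re


def tokens(text):
    """Lowercase a string and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def keyword_overlap(query, document):
    """Fraction of query tokens that also appear in the document."""
    q = tokens(query)
    return len(q & tokens(document)) / len(q) if q else 0.0


query = "how human activities affect the climate system"

# Hypothetical documents, invented for this illustration (not from BRIGHT).
docs = {
    # Shares every query word but is about office air conditioning.
    "surface_match": "How the climate control system and human activities "
                     "affect the office temperature.",
    # Relevant to the query but shares none of its words.
    "relevant": "Deforestation and urbanization raise greenhouse gas levels "
                "and drive global warming.",
}

scores = {name: keyword_overlap(query, text) for name, text in docs.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

for name in ranking:
    print(f"{name}: {scores[name]:.2f}")
```

In this toy setup the off-topic document shares every query word while the relevant one shares none, so a purely lexical ranker places the off-topic document first. Closing that gap requires linking deforestation and urbanization to human activity and climate, which is the kind of reasoning the benchmark is meant to probe.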

BRIGHT.torrent

Seeding: 1 | Downloading: 0 | Completed: 190 | Total Downloads: 336

BRIGHT/
- README.md
  2.15 KB
- README.txt
  4.3 KB

This dataset is contributed by community users and is intended for educational and informational purposes only. If any content involves copyright infringement, please contact us at [email protected] for prompt review and removal.

Related Datasets

DRACO Cross-Disciplinary Deep Research Benchmark Dataset

2 months ago

Groundsource Global Flood Events Dataset

3 months ago

CHIMERA General Inference Synthetic Dataset

4 months ago

CL-bench Context Learning Evaluation Benchmark Dataset

4 months ago

Nemotron-Math-v2 Mathematical Inference Dataset

5 months ago

MCIF Multimodal Cross-Language Instruction Following Dataset

6 months ago

TxT360-3efforts Multi-Task Inference Dataset

6 months ago

LongBench-Pro Long Context Comprehensive Evaluation Dataset

6 months ago

BRIGHT Text Retrieval Benchmark Dataset
