Date

2 years ago

Organization

Publish URL

github.com

Paper URL

arxiv.org

Tags

Natural Language Processing

This repository contains data and evaluation scripts for the HalluQA (Chinese Hallucination Question-Answering) benchmark. The full dataset is in HalluQA.json. The paper introducing HalluQA, which reports detailed experimental results on several large Chinese language models, is linked under Paper URL above. HalluQA contains 450 carefully designed adversarial questions that span multiple domains and take into account Chinese historical culture, customs, and social phenomena.
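As a quick sanity check after downloading the data, the sketch below loads a JSON file and reports how many records it holds and which keys those records use. It assumes the file is a top-level JSON list of objects; the description does not state the layout of HalluQA.json, so that structure, the helper name `summarize_dataset`, and the local file path are all assumptions made for illustration.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_dataset(path):
    """Return (record_count, key_counts) for a JSON file.

    Assumption: the top level is a list of dict records. This layout is
    not confirmed by the dataset description; adjust if the file differs.
    """
    with Path(path).open(encoding="utf-8") as f:
        records = json.load(f)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON list of records")

    # Count how often each key appears, without assuming any field names.
    key_counts = Counter()
    for record in records:
        if isinstance(record, dict):
            key_counts.update(record.keys())
    return len(records), dict(key_counts)


if __name__ == "__main__":
    # Hypothetical local path; point this at wherever the file was saved.
    data_path = Path("HalluQA.json")
    if data_path.exists():
        count, keys = summarize_dataset(data_path)
        print(f"{count} records")
        for key, seen in sorted(keys.items()):
            print(f"  {key}: present in {seen} records")
    else:
        print(f"{data_path} not found; download it from the repository first")
```

If the layout assumption holds, the reported record count should match the 450 questions mentioned above, and the key listing shows which fields are available before any evaluation code is written.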

This dataset is contributed by community users and is intended for educational and informational purposes only. If any content involves copyright infringement, please contact us at [email protected] for prompt review and removal.

Related Datasets

DRACO Cross-Disciplinary Deep Research Benchmark Dataset

2 months ago

HalluQA Chinese Large Model Hallucination Evaluation Dataset
