HalluQA Chinese Large Model Hallucination Evaluation Dataset

This repository contains the data and evaluation scripts for the HalluQA (Chinese Hallucination Question-Answering) benchmark. The full data for HalluQA is in HalluQA.json. An accompanying paper introduces HalluQA and reports detailed experimental results on several Chinese large language models.

HalluQA contains 450 carefully designed adversarial questions that span multiple domains and take into account Chinese historical culture, customs, and social phenomena.
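As a quick sanity check after cloning, the data file can be loaded and its entries counted. The sketch below is only illustrative: it assumes HalluQA.json is a UTF-8 file whose top level is a JSON array with one object per question, which the text above does not state. The helper name `load_halluqa` is hypothetical and is not part of the repository's evaluation scripts.

```python
import json
from pathlib import Path


def load_halluqa(path="HalluQA.json"):
    """Load the benchmark file and return its entries as a list.

    Assumption: the file's top level is a JSON array of question entries.
    """
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, list):
        raise ValueError("expected a top-level JSON array of question entries")
    return data


if __name__ == "__main__":
    entries = load_halluqa()
    print(f"Loaded {len(entries)} entries")
    # Field names are not documented here, so inspect them rather than guess.
    if entries and isinstance(entries[0], dict):
        print("Fields in first entry:", sorted(entries[0].keys()))
```

If the assumption about the file layout holds, the reported count should match the 450 questions described above. Printing the keys of the first entry is a way to discover the actual schema before writing any code that depends on it.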