HyperAI

Large language models often produce erroneous statements known as "hallucinations." The Known Unknowns task probes this failure mode by testing whether a model can correctly identify when the answer to a question is unknown. The goal is to evaluate whether the model can resist giving a confident but incorrect answer and instead acknowledge its uncertainty when the truth is not known. Measuring this behavior supports more reliable and transparent models, which matters for their credibility in real-world applications.
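To make the idea concrete, here is a minimal scoring sketch in Python. It is not the task's official harness: the `Item` record, the `UNKNOWN` label, and the `score` function are all hypothetical names chosen for illustration, and the sketch assumes each question carries either a known gold answer or a marker saying the answer is unknown.

```python
from dataclasses import dataclass

# Hypothetical label meaning "the answer to this question is not known".
UNKNOWN = "Unknown"


@dataclass
class Item:
    question: str
    gold: str  # the correct answer, or UNKNOWN when no answer is known


def _norm(text: str) -> str:
    """Normalize an answer for a case-insensitive exact match."""
    return text.strip().lower()


def score(items, predictions):
    """Return overall accuracy and accuracy on the unknown-answer subset."""
    if len(items) != len(predictions):
        raise ValueError("items and predictions must have the same length")

    correct = 0
    unknown_total = 0
    unknown_correct = 0
    for item, pred in zip(items, predictions):
        hit = _norm(pred) == _norm(item.gold)
        correct += hit
        if _norm(item.gold) == _norm(UNKNOWN):
            # The model should have admitted uncertainty here.
            unknown_total += 1
            unknown_correct += hit

    return {
        "accuracy": correct / len(items) if items else 0.0,
        "unknown_accuracy": (
            unknown_correct / unknown_total if unknown_total else 0.0
        ),
    }


# Illustrative, made-up items (not taken from the benchmark).
items = [
    Item("What is the capital of France?", "Paris"),
    Item("What will the weather be in Paris 100 years from today?", UNKNOWN),
]
print(score(items, ["Paris", "Sunny"]))
```

Reporting accuracy on the unknown-answer subset separately is a design choice of this sketch: a model that always guesses can still score well overall while never acknowledging uncertainty, and the separate number exposes that.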

No Data

No benchmark data available for this task

