HyperAI

HLE (Humanity's Last Exam) Reasoning Benchmark Dataset

Date

3 months ago

Size

227.35 MB

Publish URL

huggingface.co

Paper URL

arxiv.org

HLE stands for Humanity's Last Exam, a multimodal question benchmark dataset jointly released by the Center for AI Safety and Scale AI in 2025. The accompanying paper, "Humanity's Last Exam", aims to build the ultimate closed-ended evaluation covering the frontiers of human knowledge.

The dataset contains 2,500 questions covering dozens of subjects such as mathematics, humanities, and natural sciences, including multiple-choice questions and short-answer questions suitable for automatic grading.

Subject distribution:

  • Mathematics (41%): Abstract problems such as advanced mathematics, probability theory, and algorithm design.
  • Computer Science/Artificial Intelligence (10%): Machine learning theory, computational complexity, natural language processing.
  • Natural Sciences (27%): Physics (9%), Chemistry (7%), Biology/Medicine (11%), involving quantum physics, organic synthesis, pathological mechanisms, etc.
  • Humanities/Social Sciences (9%): Critical analysis questions in philosophy, history, economics, and sociology.
  • Engineering (4%) and other disciplines (9%): Covers engineering design, art history, and interdisciplinary cutting-edge issues.
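
As a rough sanity check on the figures above, the listed shares can be converted into approximate question counts. The sketch below is illustrative only: it assumes the percentages apply to the full set of 2,500 questions and that they are rounded values, so the resulting counts are estimates rather than official numbers. The names `subject_share_percent` and `estimated_counts` are made up for this example.

```python
# Estimate per-subject question counts from the percentages listed above.
# Assumption: the shares are rounded and apply to all 2,500 questions,
# so these counts are approximations, not official figures.
TOTAL_QUESTIONS = 2500

subject_share_percent = {
    "Mathematics": 41,
    "Computer Science/Artificial Intelligence": 10,
    "Natural Sciences": 27,
    "Humanities/Social Sciences": 9,
    "Engineering": 4,
    "Other disciplines": 9,
}

# Sub-shares quoted for Natural Sciences in the list above.
natural_science_breakdown = {
    "Physics": 9,
    "Chemistry": 7,
    "Biology/Medicine": 11,
}


def estimated_counts(shares, total):
    """Return the approximate number of questions for each share."""
    return {name: round(total * pct / 100) for name, pct in shares.items()}


counts = estimated_counts(subject_share_percent, TOTAL_QUESTIONS)

if __name__ == "__main__":
    for name, count in counts.items():
        print(f"{name}: ~{count} questions")
```

The top-level shares sum to 100%, and the three Natural Sciences sub-shares sum to the 27% quoted for that category, so the listed breakdown is internally consistent.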

Discipline Distribution

hle.torrent
Seeding: 1 | Downloading: 0 | Completed: 85 | Total Downloads: 400
  • hle/
    • README.md (1.69 KB)
    • README.txt (3.37 KB)
    • data/
      • hle.zip (227.35 MB)