HyperAI

NaturalProofs Mathematical Reasoning Dataset

Date

4 months ago

Size

159.82 MB

Organization

Allen Institute for Artificial Intelligence
University of Washington

Publish URL

github.com

The NaturalProofs dataset is a multi-domain corpus for studying mathematical reasoning in natural language. It was released in 2021 by researchers from the University of Washington, the Allen Institute for Artificial Intelligence, and New York University. The accompanying paper is "NaturalProofs: Mathematical Theorem Proving in Natural Language".

It contains about 30k theorem statements and proofs, 15k definitions, and 2k additional pages (e.g., axioms, corollaries), all written in natural mathematical language. The dataset draws on three kinds of sources: broad-coverage data from ProofWiki, deep-coverage data from the Stacks project, and low-resource data from mathematics textbooks. NaturalProofs unifies these sources under a common schema and releases the result as a public resource to support research on informal mathematics, in particular natural language processing and machine learning methods for mathematical reasoning.

NaturalProofs.torrent
Seeding: 2, Downloading: 0, Completed: 44, Total Downloads: 57
  • NaturalProofs/
    • README.md
      1.78 KB
    • README.txt
      3.55 KB
    • data/
      • NaturalProofs.zip
        159.82 MB
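
Once the torrent has finished, the archive in the file listing above can be inspected and unpacked with the Python standard library. The following is a minimal sketch, not an official loader: the helper names `list_archive` and `extract_archive` are hypothetical, the local path is assumed to mirror the listing, and nothing is assumed about the file names or schema inside the archive.

```python
import zipfile
from pathlib import Path


def list_archive(zip_path):
    """Return (member name, uncompressed size in bytes) pairs for a zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()]


def extract_archive(zip_path, out_dir):
    """Extract every member into out_dir and return the extracted file paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out_dir)
    return sorted(p for p in out_dir.rglob("*") if p.is_file())


if __name__ == "__main__":
    # Assumed local path, mirroring the torrent layout in the listing.
    zip_path = Path("NaturalProofs/data/NaturalProofs.zip")
    if zip_path.exists():
        # Print each member with its uncompressed size before extracting anything.
        for name, size in list_archive(zip_path):
            print(f"{size / 1e6:10.2f} MB  {name}")
```

Listing the members first shows which files the archive actually contains, so the extraction target and any later parsing can be chosen from the real contents rather than guessed.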